How to compute de Bruijn sequences for non-power-of-two-sized alphabets?

2023-01-20 23:42

I'm trying to compute de Bruijn sequences for alphabets whose size is not a power of two.

For alphabets with 2^k characters, calculating de Bruijn sequences is easy: there are several simple rules, such as "Prefer Ones" and "Prefer Opposites", which work for generating B(2,n). B(2^k,n) is exactly the same as B(2,kn), if you read the 1s and 0s as binary codes for the actual characters in your alphabet. E.g., you can interpret B(2,8n) as being over n-length sequences of bytes.

Prefer Ones is quite simple: Write n zeros. Then, always write a one unless it would cause the repetition of an n-length string; otherwise, write a zero.
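For concreteness, here is a minimal Python sketch of the Prefer Ones rule as described above. The function name `prefer_ones` and the use of a set to remember the n-length windows already written are illustrative choices, not part of the rule itself. The result is a linear (non-cyclic) string of length 2^n + n - 1.

```python
def prefer_ones(n):
    """Greedy Prefer Ones construction over {0, 1} for words of length n."""
    seq = [0] * n                # step 1: write n zeros
    seen = {tuple(seq)}          # n-length windows written so far
    while True:
        for bit in (1, 0):       # prefer a one, fall back to a zero
            # last n-1 symbols plus the candidate bit form the new window
            window = tuple(seq[len(seq) - n + 1:] + [bit])
            if window not in seen:
                seen.add(window)
                seq.append(bit)
                break
        else:
            # neither bit can be written without repeating a window: stop
            return seq

print("".join(map(str, prefer_ones(3))))  # -> 0001110100
```

Every 3-bit word appears exactly once as a window of the printed string; dropping the last n-1 symbols gives the cyclic form.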

Presently, I don't see how to generalize such rules to non-power-of-two-sized alphabets.

There's a general method for calculating de Bruijn sequences via graphs: Let each (n-1)-length sequence over your alphabet be a node; put an edge from A to B iff the rightmost n-2 characters of A are the same as the leftmost n-2 characters of B, so that every edge corresponds to exactly one n-length string. Label each edge with the last character of the string in the head vertex. Any Eulerian circuit through this graph will generate a de Bruijn sequence, and the construction guarantees that at least one such circuit exists, because the graph is strongly connected and every vertex has as many incoming edges as outgoing edges. We can use Fleury's Algorithm to (nondeterministically) construct an Eulerian circuit:

1. Choose a vertex.
2. Leave that vertex via some edge and delete that edge, only choosing edges whose deletion would disconnect the vertex from the graph if there is no alternative.
3. Append to your string the label of the edge you just deleted.
4. Goto 2 until all edges are gone.

The resulting string will be a de Bruijn sequence.
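The graph approach can be sketched in Python as follows. Two assumptions are worth stating plainly: the sketch takes the vertices to be the (n-1)-length words, so that each edge stands for one n-length word and the circuit yields the cyclic sequence B(k, n), and it uses Hierholzer's stack-based algorithm rather than Fleury's, since that avoids the connectivity test in step 2. The function name `de_bruijn_euler` is hypothetical.

```python
from itertools import product

def de_bruijn_euler(k, n):
    """Cyclic de Bruijn sequence B(k, n) read off an Eulerian circuit."""
    if n == 1:
        return list(range(k))    # single vertex with k self-loops
    # Every (n-1)-length word is a vertex with one outgoing edge per symbol.
    unused = {v: list(range(k)) for v in product(range(k), repeat=n - 1)}
    stack = [(0,) * (n - 1)]     # start at the all-zero vertex
    circuit = []
    while stack:
        v = stack[-1]
        if unused[v]:
            s = unused[v].pop()            # delete an unused edge ...
            stack.append(v[1:] + (s,))     # ... and walk along it
        else:
            circuit.append(stack.pop())    # dead end: vertex is finished
    circuit.reverse()
    # The label of each edge is the last symbol of the vertex it enters.
    return [v[-1] for v in circuit[1:]]

print(de_bruijn_euler(3, 2))
```

The returned list has k^n symbols and is meant to be read cyclically, so the last n-1 windows wrap around to the start.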

This algorithm is somewhat more complex to implement than Prefer Ones. The simplicity of Prefer Ones is that one needs only to consult the output already generated to determine what to do. Is there a straightforward way to generalize Prefer Ones (or possibly Prefer Opposites) to alphabets of non-power-of-two sizes?

This is my C++ implementation of the algorithm in Figure 1 from a paper by Sawada and Ruskey:

#include <iostream>
#include <vector>
#include <boost/function.hpp>

void debruijn(unsigned int t,
              unsigned int p,
              const unsigned int k,
              const unsigned int n,
              unsigned int* a,
              boost::function<void (unsigned int*,unsigned int*)> callback)
{
  if (t > n) {
    // we want only necklaces, not pre-necklaces or Lyndon words
    if (n % p == 0) {
      callback(a+1, a+p+1);
    }
  }
  else {
    a[t] = a[t-p];

    debruijn(t+1, p, k, n, a, callback);

    for (unsigned int j = a[t-p]+1; j < k; ++j) {
      a[t] = j;
      debruijn(t+1, t, k, n, a, callback);
    }
  }
}

struct seq_printer {
  const std::vector<char>& _alpha;

  seq_printer(const std::vector<char>& alpha) : _alpha(alpha) {}

  void operator() (unsigned int* a, unsigned int* a_end) const {
    for (unsigned int* i = a; i < a_end; ++i) {
      std::cout << _alpha[*i];
    }
  }
};

...

std::vector<char> alpha;
alpha.push_back('a');
alpha.push_back('b');
alpha.push_back('c');

unsigned int* a = new unsigned int[N+1];
a[0] = 0;

debruijn(1, 1, alpha.size(), N, a, seq_printer(alpha));
if (N > 1) std::cout << alpha[0];
std::cout << std::endl;

delete[] a;

The full reference for the paper is: Joe Sawada and Frank Ruskey, "An Efficient Algorithm for Generating Necklaces with Fixed Density", SIAM Journal on Computing 29:671-684, 1999.

According to this web page at the combinatorial group of the CS department at UVic, there's a result due to Fredricksen that you can generate a de Bruijn sequence (in fact, the lexicographically smallest one) by concatenating, in lexicographic order, the Lyndon words whose lengths divide n. There's even source code to build the sequence that you can request.

Are you only interested in a generalization of Prefer Ones or do you just want a not so complex algorithm? If the latter is true then maybe Frank Ruskey's recursive implementation could be of help.

A year ago I translated that one to Ruby.

# De Bruijn sequence
# Original implementation by Frank Ruskey (1994)
# translated to C by Joe Sawada
# and further translated to Ruby by Jonas Elfström (2009)

@n=4
@k=10
@a=[0]
@sequence=[]

def debruijn(t, p, alike)
  if t>@n
    if @n%p==0
      1.upto(p) {|j| @sequence<<@a[j]}
    end
  else
    @a[t]=@a[t-p]
    if @a[t]>0
      debruijn(t+1,p,alike+1)
    else
      debruijn(t+1,p,alike)
    end
    (@a[t-p]+1).upto(@k-1) {|j|
      @a[t]=j
      debruijn(t+1,t,alike+1)
    }
  end
end

debruijn(1,1,0)
print @sequence.join

Uckelman noticed that the alike variable does nothing. The following produces the same sequence.

@n=4
@k=10
@a=[0]
@sequence=[]

def debruijn(t, p)
  if t>@n
    if @n%p==0
      1.upto(p) {|j| @sequence<<@a[j]}
    end
  else
    @a[t]=@a[t-p]
    debruijn(t+1,p)
    (@a[t-p]+1).upto(@k-1) {|j|
      @a[t]=j
      debruijn(t+1,t)
    }
  end
end

debruijn(1,1)
print @sequence.join

Or you can use this Python version:

def de_bruijn(k, n):
    a = [0] * k * n
    sequence = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence.append(a[j])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return sequence

print(de_bruijn(2, 9))

Duval's algorithm does the same thing iteratively (in Python this time):

def debruijn(k, n):
    v = [0 for _ in range(n)]
    l = 1
    r = []
    while True:
        if n % l == 0:
            r.extend(v[0:l])
        for i in range(l, n):
            v[i] = v[i-l]
        l = n
        while l > 0 and v[l-1] >= k-1:
            l-=1
        if l == 0:
            break
        v[l-1] += 1
    return r

print(debruijn(3,5))

This is based on @stefan-gruenwald's code, whose output appears to be missing some words, and I cannot yet describe the missing set in a simple way. Although I'm not (yet) able to fix it, I wrote a few lines that list the words not found in the output:

import itertools

def debruijn (k, n) :
    v =  [ 0 for _ in range (n) ]
    l = 1
    r =  []
    while True :
        if n % l == 0 :
            r.extend (v [0:l])
        for i in range (l, n) :
            v [i] = v [i-l]
        l = n
        while l > 0 and v [l-1] >= k-1 :
            l -= 1
        if l == 0 :
            break
        v [l-1] += 1
    return r

K = int (input ( 'k ' ) )
N = int (input ( 'n ' ) )

for k in range (K) :                                # length of alphabet
    for n in range (N) :                            # length of word(s)
        List = debruijn (k, n)
        l = ''
        for L in List :
            l += str (L)
        L = itertools.product (range (k), repeat = n)
        print ( 'alphabet k', k, '\tword n', n, str ('|' + l + '|') )
        for a in L :
            searchstr = ''
            for A in a :
                searchstr += str (A)
            if not searchstr in l :
                print ( searchstr, end = ' ' )
        print ()

The same check can also write its results to files, one per (k, n) pair:

import itertools

def debruijn (k, n) :
    v =  [ 0 for _ in range (n) ]
    l = 1
    r =  []
    while True :
        if n % l == 0 :
            r.extend (v [0:l])
        for i in range (l, n) :
            v [i] = v [i-l]
        l = n
        while l > 0 and v [l-1] >= k-1 :
            l -= 1
        if l == 0 :
            break
        v [l-1] += 1
    return r

K = int (input ( 'k ' ) )
N = int (input ( 'n ' ) )

for k in range (K) :                                # length of alphabet
    for n in range (N) :                            # length of word(s)
        List = debruijn (k, n)
        l = ''
        for L in List :
            l += str (L)
        L = itertools.product (range (k), repeat = n)
        with open (str (k) + '-' + str (n), 'w') as f :
            f.write ( 'alphabet length ' + str (k) + '\tword length ' + str (n) + '\n' + l + '\nnot in:\n' )
            for a in L :
                searchstr = ''
                for A in a :
                    searchstr += str (A)
                if not searchstr in l :
                    f.write ( searchstr + ' ' )
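The words reported as missing by these checks are most likely the ones that wrap around the end of the output: the generators above return a cyclic sequence, while the scripts search a linear string. The following sketch, with a hypothetical helper `missing_words`, compares both readings. It works on tuples instead of concatenated digits, which also avoids ambiguity once k exceeds 10.

```python
from itertools import product

def missing_words(seq, k, n, cyclic=True):
    """Return the n-length words over range(k) that never occur in seq."""
    # For a cyclic reading, repeat the first n-1 symbols at the end so that
    # windows spanning the wrap-around point are counted as well.
    extended = list(seq) + list(seq[:n - 1]) if cyclic else list(seq)
    found = {tuple(extended[i:i + n]) for i in range(len(extended) - n + 1)}
    return [w for w in product(range(k), repeat=n) if w not in found]

b23 = [0, 0, 0, 1, 0, 1, 1, 1]   # what debruijn(2, 3) above returns
print(missing_words(b23, 2, 3, cyclic=False))  # -> [(1, 0, 0), (1, 1, 0)]
print(missing_words(b23, 2, 3, cyclic=True))   # -> []
```

Read linearly, the two words that cross the end of the sequence are absent; read cyclically, nothing is missing.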
