开发者

How to obtain all subsequence combinations of a String (in Java, or C++ etc)

Let's say I've a string "12345" I should obtain all subsequence combinations of this string such as:

  1. --> 1 2 3 4 5
  2. --> 12 13 14 15 23 24 25 34 35 45
  3. --> 123 124 125 234 235 345
  4. --> 1234 1235 1245 1345 2345
  5. --> 12345

Please note that I grouped them in different number of chars but not changed their order. I need a method/function does th开发者_运维知识库at.


You want a powerset. Here are all the questions on StackOverflow that mention powersets or power sets.

Here is a basic implementation in python:

def powerset(s):
    n = len(s)
    masks = [1<<j for j in xrange(n)]
    for i in xrange(2**n):
        yield [s[j] for j in range(n) if (masks[j] & i)]


if __name__ == '__main__':
    for elem in powerset([1,2,3,4,5]):
        print elem

And here is its output:

[]
[1]
[2]
[1, 2]
[3]
[1, 3]
[2, 3]
[1, 2, 3]
[4]
[1, 4]
[2, 4]
[1, 2, 4]
[3, 4]
[1, 3, 4]
[2, 3, 4]
[1, 2, 3, 4]
[5]
[1, 5]
[2, 5]
[1, 2, 5]
[3, 5]
[1, 3, 5]
[2, 3, 5]
[1, 2, 3, 5]
[4, 5]
[1, 4, 5]
[2, 4, 5]
[1, 2, 4, 5]
[3, 4, 5]
[1, 3, 4, 5]
[2, 3, 4, 5]
[1, 2, 3, 4, 5]

Notice that its first result is the empty set. Change the iteration from this for i in xrange(2**n): to this for i in xrange(1, 2**n): if you want to skip an empty set.

Here is the code adapted to produce string output:

def powerset(s):
    n = len(s)
    masks = [1<<j for j in xrange(n)]
    for i in xrange(2**n):
        yield "".join([str(s[j]) for j in range(n) if (masks[j] & i)])

Edit 2009-10-24

Okay, I see you are partial to an implementation in Java. I don't know Java, so I'll meet you halfway and give you code in C#:

    static public IEnumerable<IList<T>> powerset<T>(IList<T> s)
    {
        int n = s.Count;
        int[] masks = new int[n];
        for (int i = 0; i < n; i++)
            masks[i] = (1 << i);
        for (int i = 0; i < (1 << n); i++)
        {
            List<T> newList = new List<T>(n);
            for (int j = 0; j < n; j++)
                if ((masks[j] & i) != 0)
                    newList.Add(s[j]);
            yield return newList;
        }
    }


The simplest algorithm for generating subsets of a set of size N is to consider all binary numbers using N bits. Each position in the number represents an element from the set. If a bit in the number is 1, the corresponding set element is in the subset, otherwise the element isn't in the subset. Since the bits in a number are ordered, this preserves the ordering of the original set.

References:

  1. "Efficiently Enumerating the Subsets of a Set"; Loughry, Hemert and Schoofs
  2. "Generating Subsets"; Stony Brook Algorithm Repository


way way cleaner approach can be achieved through recursion as follows.

Public class StrManipulation{

    public static void combinations(String suffix,String prefix){
        if(prefix.length()<0)return;
        System.out.println(suffix);
        for(int i=0;i<prefix.length();i++)
         combinations(suffix+prefix.charAt(i),prefix.substring(i+1,prefix.length()));
    }

    public static void main (String args[]){
        combinations("","12345");
        }
}


In C++ given the following routine:

template <typename Iterator>
bool next_combination(const Iterator first, Iterator k, const Iterator last)
{
   /* Credits: Mark Nelson http://marknelson.us */
   if ((first == last) || (first == k) || (last == k))
      return false;
   Iterator i1 = first;
   Iterator i2 = last;
   ++i1;
   if (last == i1)
      return false;
   i1 = last;
   --i1;
   i1 = k;
   --i2;
   while (first != i1)
   {
      if (*--i1 < *i2)
      {
         Iterator j = k;
         while (!(*i1 < *j)) ++j;
         std::iter_swap(i1,j);
         ++i1;
         ++j;
         i2 = k;
         std::rotate(i1,j,last);
         while (last != j)
         {
            ++j;
            ++i2;
         }
         std::rotate(k,i2,last);
         return true;
      }
   }
   std::rotate(first,k,last);
   return false;
}

You can then proceed to do the following:

std::string s = "12345";
for(std::size_t i = 1; i <= s.size(); ++i)
{
   do
   {
      std::cout << std::string(s.begin(),s.begin() + i) << std::endl;
   }
   while(next_combination(s.begin(),s.begin() + i,s.end()));
}


using python, the itertools module defines a combinations() method which does just what you need.

from itertools import *
list(combinations( '12345', 2 ))

will give you:

[('1', '2'), ('1', '3'), ('1', '4'), ('1', '5'), ('2', '3'), ('2', '4'), ('2', '5'), ('3', '4'), ('3', '5'), ('4', '5')]


Java implementation of outis' answer, taking the input strings as args.

import java.util.ArrayList;
import java.util.List;

public class Combo {

  public static void main(String[] args) {
    List<String> results = new ArrayList<String>();
    for ( int i = 1; i <= (1<<(args.length))-1; i++ ) {
      StringBuilder builder = new StringBuilder();
      for ( int j = 0; j < args.length; j++ ) {
        if ( (i & (1<<j)) != 0) {
          builder.append(args[j]);
        }
      }
      results.add(builder.toString());
    }
    System.out.println( results );
  }
}

Here's a run.

> javac Combo.java
> java Combo A B C
[A, B, AB, C, AC, BC, ABC]


You can use the following class for this (in Java):

class Combinations {

  String input;
  StringBuilder cur;

  private void next(int pos, int reminder) {
    cur.append(input.charAt(pos));

    if (reminder == 1) {
      System.out.println(cur);
    } else {
      for (int i = pos + 1; i + reminder - 1 <= input.length(); i++)
        next(i, reminder - 1);
    }
    cur.deleteCharAt(cur.length() - 1);
  }

  public void generate(String input) {
    cur = new StringBuilder();
    this.input = input;
    for (int length = 1; length <= input.length(); length++)
      for (int pos = 0; pos + length <= input.length(); pos++)
        next(pos, length);
  }
}

To run your example use the following code:

new Combinations().generate("12345");

The order of the output is the same as in example. It does not require to store all subsets and then sort them to obtain the order you described.


The code to generate all possible combinations of strings is given in java. The all possible combinations of string of length 4 is 2 ^ 4 (2 raised to the power 4). In general for a string of length n the possible combinations are 2 ^ n (2 raised to the power n). Hence the code:

    class Perms
    {
    public void permsOfString(String a)
      {
     int x = 1;

     /* 
          Computes 2^string length

     */

     for(int i = 0;i<a.length() ;i++)
     {
         x = x * 2;
     }
     /*
            Iterate through all the possible combinations using a binary value of the number

      */
     for(int i = 1 ;i<x;i++)
     {

         String binStr = Integer.toBinaryString(i); // Convert i to binary string 
         for(int j = binStr.length() ; j <  a.length() ;j++)
         {
             binStr = "0"+binStr; // left pad with 0s
         }
   /*loop through the binary string if a character at the string is '1' note the    index,then display the character of the given string with that index */

          for(int k = 0; k <binStr.length();k++)
          {
             if(binStr.charAt(k) == '0') continue;
             else
             {
                 System.out.print(a.charAt(k));
             }

          }
         System.out.println();

     }

    }
    public static void main(String[]s)
  {
Perms p = new Perms();
p.permsOfString("abcd");
   }
} 


Adrien Plisson's answer shows how one retrieves all subsequences of a specified length in Python (for arbitrary sequence data types). The OP specifies that he works with strings, and that he wants all subsequences. Thus, using itertools.combinations we define:

>>> from itertools import combinations
>>> def subseq_combos(inp):
...     return (''.join(s) for r in range(len(inp) + 1) for s in combinations(inp, r))
... 
>>> list(subseq_combos('12345'))
['', '1', '2', '3', '4', '5', '12', '13', '14', '15', '23', '24', '25', '34', '35', '45', '123', '124', '125', '134', '135', '145', '234', '235', '245', '345', '1234', '1235', '1245', '1345', '2345', '12345']

(If the empty subsequence should be omitted, then use range(1, len(inp) + 1)).)


oops, wrong answer:

Subsequences of a certain length in Python:

def subseqs(seq, length):
    for i in xrange(len(seq) - length + 1):
        yield seq[i:i+length]

Used like this:

for each in subseqs("hello", 3):
    print each

prints:

hel
ell
llo

To generate all subsequences do this:

for i in xrange(len("hello")):
    for each in subseqs("hello", i + 1):
        print each

prints:

h
e
l
l
o
he
el
ll
lo
hel
ell
llo
hell
ello
hello

Mick.

Now I see, you wanted subsets, not sublists.


C implementation

//Usage
combinations((char*)"",(char*)"12346897909787");


void combinations(char* suffix,char* prefix){
    if(NULL ==prefix || NULL == suffix){ return ;}
    int prefixLen = strlen(prefix);
    printf("\n[%s]",suffix);
    int slen  = strlen(suffix);
    char* s   = (char*)malloc(slen+2);
    s[slen+1] = '\0';
    for(int i=0;i<prefixLen;i++){
        strcpy(s,suffix);
        s[slen]  = prefix[i];
        int npfl = prefixLen-(i+1);
        char* p  = (char*) malloc(npfl+1);
        p[npfl]  = '\0';
        strcpy(p,prefix+i+1);
        combinations(s,p);
        free(p);
    }
    free(s);
}


C++ solution:

#include<iostream>
#include<string>

using namespace std;

int sub[10];

void next(int max, int length) {

    int pos = length - 1;

    //find first digit that can be increased
    while(pos >= 0)
    {
        if(sub[pos] == max - (length - 1 - pos))
            pos--;

        else
            break;
    }

        sub[pos]++; //increase digit

        //update other digits
        for(int a = pos+1; a < length; a++)
            sub[a] = sub[a-1] + 1;

}

int main()
{
    string word;
    cin >> word; 

    int max = word.length() - 1; //max value


    for(int n=1; n <= max+1; n++)
    {

        cout << n << "\n----\n";

        for(int i = 0; i < n; i++)
        {
            sub[i] = i;
        }

        for(int a = 0; ; a++)
        {               
            for(int b=0; b < n; b++)
                cout << word[sub[b]];

            cout << '\n';

            if(sub[0] == max - (n - 1))
                break;

            else
                next(max, n); //maximum value and last position
        }   

        cout << '\n';

    }   


    return 0;
 }
> for input :Sigma
> output is
1
----
s
i
g
m
a

2
----
si
sg
sm
sa
ig
im
ia
gm
ga
ma

3
----
sig
sim
sia
sgm
sga
sma
igm
iga
ima
gma

4
----
sigm
siga
sima
sgma
igma

5
----
sigma


public class Sub {

    public static void main(String[] args) {
          String str="ADDIS";
          Sub.DisplaySubsequence( str.toCharArray(), 
          str.toCharArray().length,0 );
     }
     public static void DisplaySubsequence(char[] set, int n, int index){
         
        if(n>index){
            
                String x =String.valueOf(set[index]);
                System.out.println(x);
                String concat=x;
                for(int j=index+1; j<n; j++){
                        concat=concat.concat(String.valueOf(set[j]));
                        System.out.println(concat);
                 }
                 DisplaySubsequence(set, n,index+1);
         }
         
            return 0;
         
    }
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜