Algorithm to get the rank of a particular combination of element from a number of sets
I have a bunch of sets say S1, S2, S3, ... Each set have different elements. Say S1 = { A, B, C}, S2 = { X, Y }, S3 = { P, Q, R, T }
There exists a combination of these sets K = { S1, S2, S3 }. For example, an instance of this combination would be { A, X, P }. There are obviously 3 x 2 x 4 = 24 combinations possible. What I need is the "rank" of a particular combination, calculated using simple ordered enumeration from left to right, and the vice versa.
Obviously, I can calculate this easily by simply enumerating all the combinations and comparing it to the requested combination while keeping a counter, but I need an efficient algorithm as my sets can contain up to 20000 elements each, and number of combined sets for some case are > 10.
开发者_运维知识库BTW, I am aware of the thread Compute rank of a combination? here in stack overflow. But, unfortunately, it is not applicable here as my combinations are made out of different sized sets for different positions
I would appreciate an implementation in C# but other languages or pseudo-code would also be very helpful.
Any suggestions, please
kemal
UPDATE: @spinning_plane & @aasmund. Thank you for the answers. They both provide me with the same formula for calculating the rank.
But, I also need the other way around. i.e. to get the combination for a given rank (zero based). For example, given the rank 0, the result would be {A,X,P} ,for 3 {A, X, R }, etc. Anyone with an algorithm, please?
Think of your set as a number whose possible values for each digit is the size of the associated set. To see this, supposing the size of each set S1...S3 is the same, say 2 for simplicity. To calculate the rank of the set, you would simply interpret K as a binary number and translate it to its base-10 equivalent. rank(x) is simply the 0 based index of the element in the set.
rank(A)*2^0 + rank(X)*2^1 + rank(P)*2^2
Now to generalize this to the case when the sets can be different size, we can write out an expression for the calculation
rank(A) + rank(X)*len(S1) + rank(P)*len(S2)*len(S1) ... etc.
In psuedo-code
input = {'a','b','x'}
output = 0;
cumulative = 1;
for i in range(len(K)):
output += cumulative*rank(input[i],K[i]) # this returns the index of input[i] in set K[i]
cumulative*=len(K[i])
Is this what the full "ranked sequence" look like?
0: {A, X, P}
1: {A, X, Q}
2: {A, X, R}
3: {A, X, S}
4: {A, Y, P}
5: {A, Y, Q}
...
If so, let the sets be numbered from right to left as S1, S2, ..., Sn, and let the ranks of the chosen elements within their own sets (e.g. A=0, B=1, C=2) be r1, r2, ..., rn. The formula should then be
rn * |S(n-1)| * ... * |S2| * |S1| + ... + r3 * |S2| * |S1| + r2 * |S1| + r1
Why? Let's say that we pick {C, Y, Q}
. Their zero-based ranks within their respective sets are 2, 1, and 2, respectively. Because the leftmost rank is 2, it means that in order to get to that part of the ranking sequence, we need to let the rightmost and middle positions perform two full "rounds", for a total of (in this case) r2 * |S2| * |S1| = 2 * 2 * 4 = 16
rows. Then, we must skip past 1 round of the rightmost position in order to reach Y, and so on.
Edit: The formula can be simplified to
(((...) * |S3| + r3) * |S2| + r2) * |S1| + r1
(and it certainly ought to be computed in that way). Watch out for integer overflow, by the way...
精彩评论