开发者

Finding sets that are a subset of a specific set

Lets say I have 4 different values A,B,C,D with sets of identifiers attached.

A={1,2,3,4,5}

B={8,9,4}

C={3,4,5}

D={12,8}

And given set S of identifiers {1,30,3,4,5,12,8} I want it to return C and D. i.e. retrieve all sets from a group of sets for which S is a superset.

Is there any algorithms to perform this task efficiently (Preferably with low memory complexity. Using external device for storing data is not an option) ? A trivial solution would be for each member in the superset S retrieve list of sets that i开发者_Go百科nclude that member (basically inverted index) and for each returned set check that all of his members are in the superset. Unfortunately because on average the superset will include at least one member for each set there is a significant and unacceptable performance hit with this approach.

I am trying to do this in Java. Set consist of integers and the value they identify is an object. Collection of sets is not static and bound to change during the course of execution. There will be some limit on the set number though. Set size is not limited. But on average it's between 1 and 20.


  1. Go through each element x in S.
  2. For each set t for which xt, increment a counter—call it tcount—associated with t.
  3. After all that, for each set t for which tcount = | t |, you know that tS.

Application.

After step 2.

Acount = 4,
Bcount = 1,
Ccount = 3,
Dcount = 2.

Step 3 processing.

Acount ≠ |A| (4 ≠ 5) — Reject,
Bcount ≠ |B| (1 ≠ 3) — Reject,
Ccount = |C| (3 = 3) — Accept,
Dcount = |D| (2 = 2) — Accept.


Note after cgkanchi note: The following algorithm is under the assumption that you don't really use sets but arrays. If that is not the case, you should look for a method which implements intersection of sets and then the problem is trivial. This is about how to implement the notion of intersection using arrays.

  1. Sort all sets using heapsort for in-place sorting O(1) space. It runs in O(nlogn) and soon enough it will pay you back.
  2. For each set L of all sets:

    2.1. j = 0

    2.2. For the i element in L:

    2.2.1. Starting from j element find L[i] in S for which L[i] = S[j] else reject. If L and S and large enough use binary search or interpolation search (for the second one, have a look at your data distibution)

    2.3. Accept


As for Java, I’d use a Hashtable for the lookup table of the elements in S. Then for each element in X, the set you want to test if it’s a subset of S, test if it’s in the lookup table. If all elements of X are also in S, then S is a superset of X.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜