开发者

finding sublist in lists with map/select in mathematica

I have, in Wolfram Mathematica 8.0, a nested list like

nList = {{a,b},{f,g},{n,o}}

and a normal list like

lList = {a,b,c,k,m,n,o,z}

and i want to check if all the sublists in nList are in lList (in the example a,b and n,o are there but not f,g)

I've done it using For[,,,] and using index... can someone enlighten me in using functions like Map/Thread/Select to do it in开发者_JS百科 one pass.

Edit: If nList contains a,b, lList must contain a,b and not a,c,b or b,a or b,c,a


Assuming that you don't care about element ordering, here is one way:

In[20]:= Complement[Flatten[nList],lList] ==={}

Out[20]= False

EDIT

If the order matters, then here is one way:

In[29]:= And@@(MatchQ[lList,{___,PatternSequence[##],___}]&@@@nList)

Out[29]= False

For large number of sub-lists, this may be faster:

In[34]:= 
Union[ReplaceList[lList,
       {___,x:Alternatives@@(PatternSequence@@@nList),___}:>{x}]]===Union[nList]

Out[34]= False

This works as follows: ReplaceList is a very nice but often ignored command which returns a list of all possible expressions which could be obtained with the pattern-matcher trying to apply the rules in all possible ways to an expression. This is in contrast with the way the pattern-matcher usually works, where it stops upon the first successful match. The PatternSequence is a relatively new addition to the Mathematica pattern language, which allows us to give an identity to a given sequence of expression, treating it as a pattern. This allowed us to construct the alternative pattern, so the resulting pattern is saying: the sequence of any sublist in any place in the main list is collected and put back to list braces, forming back the sublist. We get as many newly formed sublists as there are sequences of the original sublists in the larger list. If all sublists are present, then Union on the resulting list should be the same as Union of the original sublist list.

Here are the benchmarks (I took a list of integers, and overlapping sublists generated by Partition):

In[39]:= tst = Range[1000];

In[41]:= sub = Partition[tst, 2, 1];

In[43]:= 
And @@ (MatchQ[tst, {___, PatternSequence[##], ___}] & @@@ sub) // Timing

Out[43]= {3.094, True}

In[45]:= 
Union[ReplaceList[tst, {___,x : Alternatives @@ (PatternSequence @@@ sub), ___} 
     :> {x}]] ===  Union[sub] // Timing

Out[45]= {0.11, True}

Conceptually, the reason why the second method is faster is that it does its work in the single run through the list (performed internally by ReplaceList), while the first solution explicitly iterates through the big list for each sub-list.

EDIT 2 - Performance

If performance is really an issue, then the following code is yet much faster:

And @@ (With[{part = Partition[lList, Length[#[[1]]], 1]},
 Complement[#, part] === {}] & /@SplitBy[SortBy[nList, Length], Length])

For example, on our benchmarks:

In[54]:= And@@(With[{part = Partition[tst,Length[#[[1]]],1]},
       Complement[#,part]==={}]&/@SplitBy[SortBy[sub,Length],Length])//Timing

Out[54]= {0.,True}

EDIT 3

Per suggestion of @Mr.Wizard, the following performance improvement can be made:

Scan[
 If[With[{part = Partition[lList, Length[#[[1]]], 1]},
   Complement[#, part] =!= {}], Return[False]] &,
 SplitBy[SortBy[nList, Length], Length]
] === Null

Here, the as soon as we get a negative answer from sub-lists of a given length, sublists of other lengths will not be checked, since we already know that the answer is negative (False). If Scan completes without Return, it will return Null, which will mean that lList contains all of the sublists in nList.


You could use pattern matching to do this job:

In[69]:= nList = {{a, b}, {f, g}, {n, o}};
lList = {a, b, c, k, m, n, o, z};

The @@@ is an alias for Apply at level {1}. The level 1 of nList contains your pairs, and applying replaces the head List in them with the function to the right of @@@.

In[71]:= MatchQ[lList, {___, ##, ___}] & @@@ nList

Out[71]= {True, False, True}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜