Mathematica - Exclude a String from a String Search

I'm trying to select the most frequently occurring keywords in a table. I need to count the occurrences of a word in a list that DOES NOT include a given second, third, fourth, etc. word.

For example, I need to search for the number of times the word "lollypop" appears in a list that does not include the word "candy".

This code will return the number of times the word "lollypop" appears:

rt = Parallelize@
 Cases[MemoizeTable["Candy_table.txt"], 
  x_List /; 
   MemberQ[x, 
    s_String /; 
     StringMatchQ[s, ("*lollypop*"), IgnoreCase -> True]], {1}];

I tried adding StringFreeQ to exclude "candy", and I tried adding Nor where one would add Or in the string search, but I wasn't sure how to do that or where to put them.

I need a "this" BUT NOT "that" code, basically.


excludeList = {"candy", "other"};
toCount = "lollypop";

numberOfToCount[list_, tocount_, excludeList_] := 
  If[And @@ ((! MemberQ[list, #]) & /@ excludeList), 
     Count[list, tocount], 
     "Excluded"];

Usage:

numberOfToCount[{"lollypop", "lollypop", "the beatles"}, toCount, excludeList]
numberOfToCount[{"lollypop", "lollypop", "candy"}, toCount, excludeList]

(*
-> 2
-> Excluded
*)
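
If the goal is instead to keep the rows that mention "lollypop" somewhere but none of the excluded words anywhere, the same test can probably be moved into the Cases condition from the question. The following is only a sketch under assumptions: the table is taken to be a list of lists whose entries are mostly strings, `rows` is made-up sample data standing in for MemoizeTable["Candy_table.txt"], and `hasWord` is a hypothetical helper, not something from the original code.

```mathematica
(* Hypothetical sample data standing in for MemoizeTable["Candy_table.txt"] *)
rows = {{"red lollypop", "stick"}, {"Lollypop", "hard candy"},
   {"candy cane"}, {"big LOLLYPOP", 3}};

excludeList = {"candy", "other"};

(* hasWord (assumed helper): True if any string in row contains word, ignoring case *)
hasWord[row_List, word_String] :=
  MemberQ[row,
   s_String /; StringMatchQ[s, "*" <> word <> "*", IgnoreCase -> True]];

(* "this" BUT NOT "that": Nor is True only when no excluded word is found *)
rt = Cases[rows,
   x_List /; hasWord[x, "lollypop"] && Nor @@ (hasWord[x, #] & /@ excludeList),
   {1}];

Length[rt]
```

With this sample data, `rt` keeps the first and last rows, and the rows themselves stay intact for later searches. Nor is applied to the list of per-word tests, so adding a third or fourth excluded word only means extending `excludeList`.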


To obtain the words, try, e.g.:

dl = DictionaryLookup[];

Select[dl, 
 StringFreeQ[#, ___ ~~ "ies" ~~ ___] && 
 StringMatchQ[#, ___ ~~ "loll" ~~ ___] &]
 (*
 -> {"loll", "lolled", "lolling", "lollipop", "lollipops", "lollop", "lolloped", "lolloping", "lollops", "lolls", "lolly"}
 *)

and you can count them by adding // Length at the end (say).

EDIT: It seems I misunderstood your question. If what you're asking is how to count the number of times "canapes" appears in a list, provided that list does not include "modifiable", then:

dl2 = {"titivation", "curving", "doppelgangers", "objurgations", 
"canapes", "invaluable", "modifiable", "dissect", "ominousness", 
"sentinel"}

If[Not@MemberQ[dl2, "modifiable"], Count[dl2, "canapes"], False]
(*
-> False
*)

while

If[Not@MemberQ[dl2, "plate"], Count[dl2, "canapes"], False]

excludes "plate", thus giving 1 for this list.

But I am confused by your comments ("This code is returning a preliminary selection of lists that I'm running other searches on, so I need to keep the list intact": which list? They're all kept intact by the code above), so I must still be missing something.
