Using regexp to search for a string matching multiple words arranged in random order
How can I write the regexp to match multiple words in random order?
For example, let's assume the following lines:
Dave Imma Car Pom Dive
Dive Dome Dare
Imma Car Ryan
Pyro Dave Imma Dive
Lunar Happy Dave
I want to find the lines that contain "Dave", "Imma" and "Dive" in any order, so I expect the 1st and 4th lines to match. Is this possible?
If you insist on doing this with regex, you can use lookahead:
s.matches("(?=.*Dave)(?=.*Imma)(?=.*Dive).*")
Regex is not the most efficient way of doing this, though.
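To see the lookahead version run against the sample lines from the question, here is a minimal sketch. The class name LookaheadDemo and the helper matchingLineNumbers are made up for illustration; only the pattern itself comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class LookaheadDemo {
    // One lookahead per required word. Each lookahead scans the whole line
    // on its own, so the order of the words does not matter.
    private static final Pattern ALL_THREE =
            Pattern.compile("(?=.*Dave)(?=.*Imma)(?=.*Dive).*");

    // Hypothetical helper: returns the 1-based numbers of the matching lines.
    static List<Integer> matchingLineNumbers(String text) {
        List<Integer> result = new ArrayList<>();
        String[] lines = text.split("\n");
        for (int i = 0; i < lines.length; i++) {
            if (ALL_THREE.matcher(lines[i]).matches()) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "Dave Imma Car Pom Dive",
                "Dive Dome Dare",
                "Imma Car Ryan",
                "Pyro Dave Imma Dive",
                "Lunar Happy Dave");
        System.out.println(matchingLineNumbers(text)); // prints [1, 4]
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every line, which is what s.matches(...) would do on each call.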
On *nix, you can use awk.
If the words must appear in this order:
awk '/Dave.*Imma.*Dive/' file
If they can appear in any order:
awk '/Dave/ && /Imma/ && /Dive/' file
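If you are working in Java rather than a shell, the same two checks could be sketched roughly as follows. The class and method names are invented for this example; it is just one way to mirror the two awk commands.

```java
import java.util.regex.Pattern;

class OrderedVsUnordered {
    // Same idea as awk '/Dave.*Imma.*Dive/': the three words in this exact order.
    private static final Pattern IN_ORDER = Pattern.compile("Dave.*Imma.*Dive");

    static boolean inOrder(String line) {
        // find() looks for the pattern anywhere in the line, like awk's /.../
        return IN_ORDER.matcher(line).find();
    }

    // Same idea as awk '/Dave/ && /Imma/ && /Dive/': three independent tests.
    static boolean anyOrder(String line) {
        return line.contains("Dave") && line.contains("Imma") && line.contains("Dive");
    }

    public static void main(String[] args) {
        System.out.println(inOrder("Pyro Dave Imma Dive"));  // true
        System.out.println(inOrder("Dive Imma Dave"));       // false
        System.out.println(anyOrder("Dive Imma Dave"));      // true
    }
}
```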
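// s is the line being tested
if (s.matches("(Dave|Imma|Dive) (Dave|Imma|Dive) (Dave|Imma|Dive)")
        && s.contains("Dave") && s.contains("Imma") && s.contains("Dive"))
{
    // This covers lines made up of exactly the three names, in any order.
    // Lines with extra words are not matched.
}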
I don't think it's possible to do this exactly, though. Sorry.
import java.util.ArrayList;
import java.util.List;

String[] lines = fullData.split("\n");
String[] names = {"Dave", "Imma", "Dive"};
List<Integer> matches = new ArrayList<>();
nextLine:
for (int i = 0; i < lines.length; i++) {
    for (String name : names) {
        // If any of the names in the list isn't found
        // then this line isn't a match, so skip to the next line
        if (!lines[i].contains(name)) {
            continue nextLine;
        }
    }
    // If we made it this far, all of the names were found
    matches.add(i + 1); // store 1-based line numbers
}
// matches now contains [1, 4]
If you don't need to know where the matches are, it can be simplified to:
String[] lines = fullData.split("\n");
String[] names = {"Dave", "Imma", "Dive"};
nextLine:
for (String line : lines) {
    for (String name : names) {
        // If any of the names in the list isn't found
        // then this line isn't a match, so skip to the next line
        if (!line.contains(name)) {
            continue nextLine;
        }
    }
    // If we made it this far, all of the names were found
    // Do something
}
Should the following lines match?
Dave Imma Dave
Dave Imma Dive Imma
I'm guessing the first one shouldn't because it doesn't contain all three names, but are duplicates okay? If not, this regex does the trick:
^(?:\b(?:(?!(?:Dave|Imma|Dive)\b)\w+[ \t]+)*(?:Dave()|Imma()|Dive())[ \t]*){3}$\1\2\3
I use the word "trick" advisedly. :) This proves that a regex can do the job, but I wouldn't expect to see this regex in any serious application. You'd be much better off writing a method for this purpose.
(By the way, if duplicates are allowed, just remove the $.)
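For comparison, here is a rough sketch of what "writing a method for this purpose" might look like. The name containsEachExactlyOnce is hypothetical, and the sketch assumes that names count only as whole, whitespace-separated words and that other words on the line are ignored. Those are my assumptions, not something the question states.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExactlyOnce {
    // Hypothetical helper: true if every name occurs exactly once in the line
    // as a whitespace-separated word. Words that are not names are ignored.
    static boolean containsEachExactlyOnce(String line, List<String> names) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (names.contains(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        for (String name : names) {
            if (counts.getOrDefault(name, 0) != 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Dave", "Imma", "Dive");
        System.out.println(containsEachExactlyOnce("Dave Imma Car Pom Dive", names)); // true
        System.out.println(containsEachExactlyOnce("Dave Imma Dave", names));         // false
        System.out.println(containsEachExactlyOnce("Dave Imma Dive Imma", names));    // false
    }
}
```

To allow duplicates, the final check could test for a count of at least 1 instead of exactly 1.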
EDIT: Another question: should the names be matched only in the form of complete words? In other words, should these lines match?
DaveCar PomDive Imma
DaveImmaDive
So far, the only other answer that enforces both uniqueness and complete words is Coronatus's, and it fails to match lines with extra words, like these:
Dave Imma Car Pom Dive
Pyro Dave Imma Dive
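If complete words matter but duplicates and extra words are fine, one possible middle ground is to put \b word boundaries inside the lookaheads. This is only a sketch, and the class name is made up; it does not enforce uniqueness.

```java
import java.util.regex.Pattern;

class WholeWordLookahead {
    // Each lookahead requires the name as a complete word somewhere in the line.
    // Extra words and repeated names are allowed.
    private static final Pattern ALL_WHOLE_WORDS = Pattern.compile(
            "(?=.*\\bDave\\b)(?=.*\\bImma\\b)(?=.*\\bDive\\b).*");

    static boolean matches(String line) {
        return ALL_WHOLE_WORDS.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Dave Imma Car Pom Dive")); // true
        System.out.println(matches("Pyro Dave Imma Dive"));    // true
        System.out.println(matches("DaveCar PomDive Imma"));   // false
        System.out.println(matches("DaveImmaDive"));           // false
    }
}
```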