Using a regexp to search for strings that contain multiple words in any order

How can I write the regexp to match multiple words in random order?

For example, let's assume the following lines:

Dave Imma Car Pom Dive
Dive Dome Dare
Imma Car Ryan
Pyro Dave Imma Dive
Lunar Happy Dave

I want to find the lines that contain "Dave", "Imma", and "Dive", so I expect the 1st and 4th lines to match. Is this possible?


If you insist on doing this with regex, you can use lookahead:

s.matches("(?=.*Dave)(?=.*Imma)(?=.*Dive).*")

Regex is not the most efficient way of doing this, though.
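
As a rough illustration of how that lookahead pattern behaves, here is a minimal sketch run against the sample lines from the question. The class name LookaheadDemo and the matchingLines helper are hypothetical, made up for this example; only the pattern and String.matches come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

class LookaheadDemo {
    // Each (?=.*word) lookahead asserts that the word occurs somewhere in the
    // line without consuming input, so the order of the words does not matter.
    static final String ALL_THREE = "(?=.*Dave)(?=.*Imma)(?=.*Dive).*";

    // Hypothetical helper: returns the 1-based numbers of the matching lines.
    static List<Integer> matchingLines(String[] lines) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].matches(ALL_THREE)) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] lines = {
            "Dave Imma Car Pom Dive",
            "Dive Dome Dare",
            "Imma Car Ryan",
            "Pyro Dave Imma Dive",
            "Lunar Happy Dave"
        };
        System.out.println(matchingLines(lines)); // prints [1, 4]
    }
}
```

Note that this is a substring match, so "Dave" would also be found inside a longer word such as "Davenport". Wrapping each word in \b, as in (?=.*\bDave\b), restricts the match to whole words.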


On *nix, you can use awk.

If the words always appear in this order:

awk '/Dave.*Imma.*Dive/' file

If they can appear in any order:

awk '/Dave/ && /Imma/ && /Dive/' file


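if (s.matches("(Dave|Imma|Dive) (Dave|Imma|Dive) (Dave|Imma|Dive)")
        && s.contains("Dave") && s.contains("Imma") && s.contains("Dive")) {
    // This will work in 90% of cases, but only for lines made up of
    // exactly these three names in any order; extra words break the match.
}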

I don't think it's possible to do this exactly, though. Sorry.


import java.util.ArrayList;
import java.util.List;

String[] lines = fullData.split("\n");
String[] names = {"Dave", "Imma", "Dive"};
List<Integer> matches = new ArrayList<>();

nextLine:
for (int i = 0; i < lines.length; i++) {
    for (String name : names) {
        // If any of the names in the list isn't found
        // then this line isn't a match, so skip to the next line
        if (!lines[i].contains(name)) {
            continue nextLine;
        }
    }
    // If we made it this far, all of the names were found
    matches.add(i + 1); // store 1-based line numbers
}
// matches now contains [1, 4]

If you don't need to know where the matches are, it can be simplified to:

String[] lines = fullData.split("\n");
String[] names = {"Dave", "Imma", "Dive"};

nextLine:
for (String line : lines) {
    for (String name : names) {
        // If any of the names in the list isn't found
        // then this line isn't a match, so skip to the next line
        if (!line.contains(name)) {
            continue nextLine;
        }
    }
    // If we made it this far, all of the names were found

    // Do something
}
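
The same per-line check can probably be written more compactly with streams. This is only a sketch: the class name AllNamesFilter and the method linesWithAll are hypothetical, and fullData is assumed to be newline-separated text, as in the snippets above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class AllNamesFilter {
    // Keeps only the lines that contain every name as a substring.
    static List<String> linesWithAll(String fullData, String... names) {
        return Arrays.stream(fullData.split("\n"))
                // allMatch stops at the first name that is missing from the line
                .filter(line -> Arrays.stream(names).allMatch(line::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String fullData = "Dave Imma Car Pom Dive\nDive Dome Dare\nImma Car Ryan\n"
                + "Pyro Dave Imma Dive\nLunar Happy Dave";
        // Prints the two lines that contain all three names
        linesWithAll(fullData, "Dave", "Imma", "Dive").forEach(System.out::println);
    }
}
```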


Should the following lines match?

Dave Imma Dave
Dave Imma Dive Imma

I'm guessing the first one shouldn't because it doesn't contain all three names, but are duplicates okay? If not, this regex does the trick:

^(?:\b(?:(?!(?:Dave|Imma|Dive)\b)\w+[ \t]+)*(?:Dave()|Imma()|Dive())[ \t]*){3}$\1\2\3

I use the word "trick" advisedly. :) This proves that a regex can do the job, but I wouldn't expect to see this regex in any serious application. You'd be much better off writing a method for this purpose.

(By the way, if duplicates are allowed, just remove the $.)
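
For comparison, here is one way such a method might look. It is a sketch under stated assumptions, not an equivalent of the regex above: it assumes names are whole words separated by whitespace, it tolerates unrelated words anywhere in the line, and the names NameMatcher, containsAll, and allowDuplicates are invented for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NameMatcher {
    // Returns true if the line contains every name as a whole word.
    // When allowDuplicates is false, each name must appear exactly once.
    static boolean containsAll(String line, List<String> names, boolean allowDuplicates) {
        // Assumption: words are separated by runs of whitespace
        List<String> words = Arrays.asList(line.trim().split("\\s+"));
        for (String name : names) {
            int count = Collections.frequency(words, name);
            if (count == 0 || (!allowDuplicates && count > 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Dave", "Imma", "Dive");
        System.out.println(containsAll("Pyro Dave Imma Dive", names, false)); // true
        System.out.println(containsAll("Dave Imma Dave", names, true));       // false, no "Dive"
        System.out.println(containsAll("Dave Imma Dive Imma", names, false)); // false, "Imma" twice
        System.out.println(containsAll("Dave Imma Dive Imma", names, true));  // true
    }
}
```

Because the line is split into words before comparing, a run-together token such as "DaveImmaDive" does not count as containing any of the names.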

EDIT: Another question: should the names be matched only in the form of complete words? In other words, should these lines match?

DaveCar PomDive Imma
DaveImmaDive

So far, the only other answer that enforces both uniqueness and complete words is Coronatus's, and it fails to match lines with extra words, like these:

Dave Imma Car Pom Dive
Pyro Dave Imma Dive