Regular Expressions Question

I have this program:

        Dim words() As String = {"car", "arc", "caar"}

        For Each w In words
            Dim rx = Regex.IsMatch("rca", "^[" + w + "]+$")
            Console.WriteLine(rx)
        Next

        Console.ReadLine()

This Regex "^[" + w + "]+$" finds all words which consist of the letters "rca". It matches for every word, because every word is made up of letters from "rca". Is there something I could add to return False for "caar", because "rca" has only one "a" but "caar" has two?


You can do it e.g. with the following regular expression:

"(?=^[^r]*r[^r]*$)(?=^[^c]*c[^c]*$)(?=^[^a]*a[^a]*$)^[rca]+$"

It matches any word made up of the letters "rca" in which each letter occurs exactly once.

Addendum: if the condition is "at most once", you can instead use

"(?=^[^r]*r?[^r]*$)(?=^[^c]*c?[^c]*$)(?=^[^a]*a?[^a]*$)^[rca]+$"
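The "exactly once" pattern above can also be generated for any set of letters rather than typed by hand. The following is a minimal sketch in Python (chosen only for illustration; the lookahead syntax is the same in .NET). The helper name exactly_once_pattern is hypothetical, and the sketch assumes the letters passed in are distinct.

```python
import re

def exactly_once_pattern(letters: str) -> str:
    """Build a pattern matching strings that use each given letter exactly once."""
    # One lookahead per letter: "anything but x, then x, then anything but x".
    lookaheads = "".join(
        f"(?=^[^{re.escape(ch)}]*{re.escape(ch)}[^{re.escape(ch)}]*$)"
        for ch in letters
    )
    # The final part restricts the whole string to the allowed letters.
    return f"{lookaheads}^[{re.escape(letters)}]+$"

pattern = exactly_once_pattern("rca")
results = {w: bool(re.search(pattern, w)) for w in ["car", "arc", "caar"]}
print(results)  # {'car': True, 'arc': True, 'caar': False}
```

Because every letter must appear exactly once, this only accepts permutations of the letter set; a shorter word such as "rc" is rejected as well.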


This Regex "^[" + w + "]+$" finds all words which consists of letters "rca"

No, it does not. It checks whether the string "rca" consists only of letters that happen to be in w.

What you mean (given that your plain English explanation reflects what you want) is:

Dim rx = Regex.IsMatch(w, "^[rca]+$")

You could change + to {3}, but this would still match "aaa".

To match any permutation of three letters, you would have to add permutations yourself. Regex can't do this for you.

Dim rx = Regex.IsMatch(w, "^(rca|rac|acr|arc|car|cra)$")
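If the letter set changes, the list of permutations can be built programmatically instead of by hand. This is a rough sketch in Python (for illustration only); permutation_pattern is a hypothetical helper. Note the group around the alternatives: it keeps ^ and $ applied to the whole alternation rather than only to the first and last alternative.

```python
import re
from itertools import permutations

def permutation_pattern(letters: str) -> str:
    """Build an anchored alternation of every ordering of the given letters."""
    # A set removes duplicates when a letter is repeated; sorting keeps the output stable.
    alternatives = sorted({"".join(p) for p in permutations(letters)})
    return "^(?:" + "|".join(re.escape(a) for a in alternatives) + ")$"

pattern = permutation_pattern("rca")
print(pattern)  # ^(?:acr|arc|car|cra|rac|rca)$

matches = [w for w in ["car", "arc", "caar", "aaa"] if re.search(pattern, w)]
print(matches)  # ['car', 'arc']
```

The number of alternatives grows factorially with the number of letters, so this is only practical for short letter sets.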


You would have to do it separately, outside the regex. The [...] construct always treats repeated characters as if they were entered once. You could do something like this right before the Console.WriteLine(rx) (note: writing in C# because I'm not very current in VB):

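// Requires: using System.Linq;
foreach (var ch in w)
{
    // Reject w if any of its letters occurs a different number of times than in "rca".
    if (w.Count(c => c == ch) != "rca".Count(c => c == ch))
        return false;
}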
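The same counting idea can be written without a regex at all. Below is a small sketch in Python (for illustration only) that compares letter counts directly. The function names are hypothetical, and the two variants reflect two possible readings of the question: "exactly the same letters" versus "no letter used more often than it is available".

```python
from collections import Counter

def same_letters(word: str, letters: str = "rca") -> bool:
    """True if word uses exactly the letters given, each the same number of times."""
    return Counter(word) == Counter(letters)

def uses_no_more_than(word: str, letters: str = "rca") -> bool:
    """True if no letter of word occurs more often than it does in letters."""
    available = Counter(letters)
    # A Counter returns 0 for missing keys, so foreign letters are rejected too.
    return all(available[ch] >= n for ch, n in Counter(word).items())

for w in ["car", "arc", "caar"]:
    print(w, same_letters(w), uses_no_more_than(w))
```

With this approach "caar" is rejected by both checks, while a shorter word such as "ca" passes only the second one.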


Currently, it looks like you're using the words as the pattern to search for. Perhaps you mean:

Dim rx = Regex.IsMatch(w, "^[rca]+$")

To match only words that contain exactly one character from your set (r, c or a), you might try:

^[^rca]*[rca][^rca]*$

This will match, in order:

"anything not r, c or a" zero or more times;
"r, c or a" exactly once;
"anything not r, c or a" zero or more times.
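To see what this last pattern accepts in practice, here is a quick check in Python (for illustration only; the sample words are made up). It suggests the pattern accepts strings with exactly one occurrence of r, c or a in total, so a word like "car" is rejected.

```python
import re

pattern = r"^[^rca]*[rca][^rca]*$"

samples = ["bed", "bad", "car", "xyz", "r"]
accepted = [s for s in samples if re.search(pattern, s)]
print(accepted)  # ['bad', 'r']
```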
