.NET regex: meaning of [^\\.]+
I have a question about a regex. Given this part of a regex:
(.[^\\.]+)
Does the part [^\.]+ mean "get everything until the first dot"? So with this text:
Hello my name is Martijn. I live in Holland.
I get 2 results: both sentences. But when I leave out the + sign, I get matches of two characters each: He, ll, o<space>, my, etc. Why is that?
Your regex .[^\\.]+ means:
- Match any single character.
- Then match one or more characters that are neither a backslash nor a dot ".".
Note that [^\\.] is a negated character class: it matches any one character that is not a backslash and not a dot, so neither of those two characters can be part of the match. Because of the "+" at the end, it keeps matching characters until it finds a dot or a backslash (or reaches the end of the input). The "+" is a greedy quantifier: it consumes as many characters as it can.
When you input (quotes not included): "Hello my name is Martijn. I live in Holland." The matches are:
- Hello my name is Martijn
- . I live in Holland
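Note that the dot is not included in the first match, since that match stops at the n in Martijn, and the second match starts with the dot. The final dot is not matched at all, because nothing follows it for [^\\.]+ to match.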
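To check the two matches listed above, here is a minimal sketch. It uses Python's re module as a stand-in for .NET, on the assumption (which holds for the simple constructs in this pattern) that both engines treat it the same way. The variable names are only illustrative.

```python
import re

text = "Hello my name is Martijn. I live in Holland."

# Raw string, so the regex engine sees exactly: .[^\\.]+
# i.e. any character, then one or more characters that are
# neither a backslash nor a dot.
pattern = re.compile(r".[^\\.]+")

matches = pattern.findall(text)
for m in matches:
    print(repr(m))
# 'Hello my name is Martijn'
# '. I live in Holland'
```

The second match begins with the dot because the leading . in the pattern accepts any character, including a literal dot.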
When you remove the +, leaving (.[^\\.]), it just means:
- Match any single character.
- Then match exactly one character that is not a dot or a backslash.
So every match is exactly two characters long.
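The two-character behaviour can be reproduced with a short sketch, again using Python's re module as a stand-in for .NET (an assumption; the pattern uses nothing engine-specific). The name pairs is hypothetical.

```python
import re

text = "Hello my name is Martijn. I live in Holland."

# Without the +, the class [^\\.] matches exactly one character,
# so each match is one "any" character plus one non-dot, non-backslash.
pairs = re.findall(r".[^\\.]", text)

print(pairs[:4])   # ['He', 'll', 'o ', 'my']
print(pairs[-1])   # 'an'
```

The trailing "d." is not returned: the . matches "d", but the following dot is rejected by the character class, and the last dot has no character after it.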
Because a dot outside a character class (i.e., not between []) means (almost) any character.
So, .[^\\.] means: match (almost) any character, followed by one character that is neither a dot nor a backslash. (Inside a character class, a dot doesn't need to be escaped to mean a literal dot, but a backslash does.)
In your example, that is H (any character), then e (neither a dot nor a backslash), and so on.
With the + (one or more characters that are neither a dot nor a backslash), you instead match every character up to the next dot.
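The difference between a dot inside and outside a character class can be seen on a small input. This is a sketch in Python's re module rather than .NET, assuming the same character-class rules apply in both; the sample string and variable names are made up for illustration.

```python
import re

# Five characters: a, backslash, b, dot, c
sample = r"a\b.c"

# [^\\.] rejects both the backslash and the dot.
not_backslash_or_dot = re.findall(r"[^\\.]", sample)

# [^.] rejects only the dot; an unescaped dot is literal inside [].
not_dot = re.findall(r"[^.]", sample)

# Outside a class, an unescaped dot matches any character except a newline.
any_char = re.findall(r".", sample)

print(not_backslash_or_dot)  # ['a', 'b', 'c']
print(not_dot)               # ['a', '\\', 'b', 'c']
print(any_char)              # ['a', '\\', 'b', '.', 'c']
```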
The regex means: any one character followed by more than zero characters that are not a backslash or a period.