How can I write a regex (Groovy) that selects the words "She" and "Shells", starting from REGEX = "She"?

I am new to regular expressions. I am trying to get only the words that begin with "She" ("She" and "Shells"), not "ashes", with this Groovy program. I have been working on this for some time.

saying = 'She wishes for Shells not ashes'
println saying
def pattern = ~/\bShe*\b/
def matcher = pattern.matcher(saying)
def count = matcher.getCount()
println "Matches = ${count}"
for (i in 0..<count) {
    print matcher[i] + " "
}

Output:

She wishes for Shells not ashes
Matches = 1
She

The regex does not behave like a Windows CMD wildcard, e.g. dir W* to list the folders or files whose names begin with W. What did I do wrong?

Many thanks to anyone who answers this question.


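In regular expressions, * is not a wildcard that matches any characters.

It is a quantifier that modifies whatever comes immediately before it and means "zero or more". Your regular expression matches Sh followed by zero or more e, and then requires a word boundary. In "Shells" the character after "She" is "l", which is a word character, so there is no boundary at that point and the match fails. The pattern will only match these strings: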

Sh
She
Shee
Sheee
etc...
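
To check this behaviour, here is a minimal sketch in Groovy. The list of candidate words is only an illustration and is not taken from the question; each candidate is tested as a whole string against the original pattern.

```groovy
// The original pattern: "Sh", then zero or more "e", between word boundaries.
def pattern = ~/\bShe*\b/

// Hypothetical candidates chosen for illustration.
def candidates = ['Sh', 'She', 'Shee', 'Shell', 'Shells']

// matches() succeeds only if the whole string fits the pattern.
def results = candidates.collectEntries { word ->
    [(word): pattern.matcher(word).matches()]
}

results.each { word, ok -> println "${word}: ${ok}" }
// Sh, She and Shee match; Shell and Shells do not,
// because "l" is not covered by e*.
```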

What you probably mean is \w*, which matches zero or more word characters:

/\bShe\w*\b/
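
Applied to the program from the question, that could look like the sketch below. It uses String.findAll(Pattern) from the Groovy GDK in place of the index loop over the matcher; that is a substitution made here for brevity, not something the question requires.

```groovy
def saying = 'She wishes for Shells not ashes'
println saying

// "She" followed by any further word characters, as a whole word.
def pattern = ~/\bShe\w*\b/

// findAll returns every non-overlapping match as a list of strings.
def matches = saying.findAll(pattern)

println "Matches = ${matches.size()}"
println matches.join(' ')
// Prints "Matches = 2" and then "She Shells".
```

The match is case-sensitive, so the lowercase "she" inside "wishes" and "ashes" is not picked up.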

Also note that in regular expressions "word characters" are letters, digits and the underscore. So a sequence of word characters is different from what is regarded as a "word" in human languages. It is in fact not easy to identify words correctly using regular expressions alone, so if you need to match words in a specific language you should use a natural language processing library and/or a dictionary instead of a regular expression.
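
A short sketch of that difference, using an invented sample string (not from the question) that contains an underscore, a digit, an apostrophe and a hyphen:

```groovy
// Hypothetical input chosen to show where \w* stops.
def text = "She_2 Shell's She-wolf"

def found = text.findAll(~/\bShe\w*\b/)

println found
// The underscore and the digit count as word characters, so "She_2" is
// matched whole. The apostrophe and the hyphen do not, so "Shell's" is
// cut to "Shell" and "She-wolf" to "She".
```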
