
Regex for matching a set of words

Is there a way to match a set of words in a sentence?

The requirement is that I would like to check whether a sentence contains any of the following words: po, p.o, p.o. or box. But it shouldn't catch post or sandbox.

po --> error

post --> success

box --> error

hippo --> success

Thanks in advance


This ought to do it:

/\b(p\.?o|box)\b/g
  • the first \b matches a word boundary, here the start of the word
  • the ( . . . ) forms a capturing group
  • p\.?o is the first pattern; it matches a "p" and an "o" with an optional period (".") after the "p"
  • the "|" says to match either the first pattern or the second pattern
  • box is the second pattern; it matches just the word "box" :)
  • the second \b matches a word boundary, here the end of the word
  • g is the "global" flag, so the pattern matches all occurrences rather than only the first one
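As a quick sanity check, here is a minimal sketch that runs the pattern over the sample words from the question. The names poBox, samples and results are only illustrative, and the g flag is left off because a single yes/no test per word is enough here.

```javascript
// Pattern from the answer above, without the g flag so test() keeps no state.
const poBox = /\b(p\.?o|box)\b/;

// Sample words taken from the question; true means the word would be flagged.
const samples = ["po", "p.o", "box", "post", "sandbox", "hippo"];
const results = samples.map((word) => [word, poBox.test(word)]);

for (const [word, flagged] of results) {
  console.log(`${word} --> ${flagged ? "error" : "success"}`);
}
// po --> error
// p.o --> error
// box --> error
// post --> success
// sandbox --> success
// hippo --> success
```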

If you would like it to be case insensitive, add the "i" flag at the end of the pattern:

/\b(p\.?o|box)\b/gi  <--- right here

Edit: to simplify the pattern, I removed the \.? that came after the "o". Since that "." would have to be the last character in the pattern, there is no difference between matching "p.o." and "p.o": the \b after the "o" succeeds if the next character is a period or a space (or the string ends there), and fails if it is a letter, for example. The presence of the trailing "." is irrelevant to the check.
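To illustrate the point about the trailing period, here is a small sketch using String.prototype.match with the gi flags. The sample sentences and variable names are made up for the example.

```javascript
// Same pattern as above, with the global and case-insensitive flags.
const pattern = /\b(p\.?o|box)\b/gi;

// With the g flag, match() returns every matched substring (or null if none).
const withDot = "Send it to P.O. Box 12".match(pattern);
const withoutDot = "Send it to PO Box 12".match(pattern);

console.log(withDot);    // [ 'P.O', 'Box' ]
console.log(withoutDot); // [ 'PO', 'Box' ]
```

The sentence is caught either way; the trailing "." is simply not part of the matched text.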


Use \b to catch word boundaries.

The regex fragment:

\b(po|p\.o)\b

will only match if a sentence contains the word po or the word p.o.
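A short sketch of how this fragment behaves in JavaScript; the sentences are invented for illustration. Note that the fragment covers only po and p.o, so box would need its own alternative.

```javascript
// The fragment from above as a JavaScript regex literal.
const poOnly = /\b(po|p\.o)\b/;

console.log(poOnly.test("ship to po 12"));   // true
console.log(poOnly.test("ship to p.o 12"));  // true
console.log(poOnly.test("the post office")); // false: "po" is followed by "s"
console.log(poOnly.test("a big box"));       // false: box is not part of this fragment
```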


A function that returns whether a sentence contains any of the words "po", "p.o.", or "box" (capitalized forms are matched as well):

function containsPObox(sentence) {
    // With the g flag, match() returns an array of all matches, or null if there are none.
    var matches = sentence.match(/\bp\.?o\b\.?|\b(box)\b/gi);
    return matches !== null;
}

Regarding the regex: /\bp\.?o\b\.?|\b(box)\b/gi

Breaking it down...

\b -> word boundary (a position where a word character, i.e. a letter, digit or underscore, meets a non-word character such as a space or period, or the start or end of the string)

p -> 'p'

\.? -> optional '.'

o -> 'o'

\b -> word boundary

\.? -> optional '.'

| -> "or"

\b -> word boundary

(box) -> 'box'

\b -> word boundary

g -> global: find every match in the sentence, not just the first

i -> case insensitive
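A usage sketch for the function, restated here so the snippet runs on its own. The sample sentences are made up, and sampleMatches is only there to show what the \.? after the \b adds: the trailing period becomes part of the matched text.

```javascript
// Restated from the answer above so this snippet is self-contained.
function containsPObox(sentence) {
  return sentence.match(/\bp\.?o\b\.?|\b(box)\b/gi) !== null;
}

console.log(containsPObox("Mail to p.o. box 5"));   // true
console.log(containsPObox("post a sandbox hippo")); // false

// The optional period after \b is included in the match when present.
const sampleMatches = "Mail to p.o. box 5".match(/\bp\.?o\b\.?|\b(box)\b/gi);
console.log(sampleMatches); // [ 'p.o.', 'box' ]
```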


Like sh54's response, but without the | block.

\b(p\.?o\.?)\b

This will match po, p.o and p.o. (for example in p.o.box). Note that without the | alternative it does not match the word box on its own.
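It is worth running this pattern over a few inputs to see exactly what it covers. A minimal sketch, with made-up inputs and illustrative variable names:

```javascript
// The pattern from this answer, with the global and case-insensitive flags.
const noAlternation = /\b(p\.?o\.?)\b/gi;

const inputs = ["po", "p.o", "p.o.box", "po box", "box"];
const found = inputs.map((text) => text.match(noAlternation));

inputs.forEach((text, i) => console.log(text, "->", found[i]));
// po -> [ 'po' ]
// p.o -> [ 'p.o' ]
// p.o.box -> [ 'p.o.' ]
// po box -> [ 'po' ]
// box -> null
```

The last line shows that a sentence containing only box is not caught by this pattern.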
