Regex for matching a set of words
Is there a way to match a set of words in a sentence?
The requirement is that I would like to check whether a sentence contains any of the following words: po, p.o, p.o. or box. But it shouldn't catch post or sandbox.
po --> error
post --> success
box --> error
hippo --> success
Thanks in advance
This ought to do it:
/\b(p\.?o|box)\b/g
- the first \b matches at a word boundary, here the beginning of a word
- the ( . . . ) sets a matching group
- p\.?o is the first pattern; it matches a "p" and an "o" with an optional period (".") after the "p"
- the "|" says to match the first pattern or the second pattern
- box is the second pattern; it matches just the word "box" :)
- the second \b matches at a word boundary, here the end of a word
- the g flag makes the pattern "global", so it will match all occurrences of the pattern instead of stopping at the first one
If you would like it to be case insensitive, include the "i" parameter at the end of the pattern:
/\b(p\.?o|box)\b/gi <--- right here
Edit: to simplify the pattern, I removed the \.? that came after the "o". Since the "." would have to be the last character in the pattern, there is no difference between matching "p.o." and "p.o": if the next character after "p.o" is a period or a space, it should match. If it is a letter (for example), it shouldn't. The presence of the trailing "." is irrelevant to the check.
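As a rough check of the pattern against the examples from the question, here is a minimal sketch. The names poBoxPattern and mentionsPoBox are made up for illustration. The g flag is left out on purpose: RegExp.prototype.test on a global regex remembers its position in lastIndex between calls, which can give surprising results when the same regex object is reused.

```javascript
// Hypothetical names; the pattern is the one from this answer without the g flag,
// because test() on a global regex keeps state in lastIndex between calls.
const poBoxPattern = /\b(p\.?o|box)\b/i;

// Returns true when the sentence contains "po", "p.o" or "box" as a whole word.
function mentionsPoBox(sentence) {
  return poBoxPattern.test(sentence);
}

// The four examples from the question, plus two extra cases.
const samples = ["po", "post", "box", "hippo", "P.O. Box 12", "sandbox"];
for (const sample of samples) {
  console.log(sample, "-->", mentionsPoBox(sample) ? "error" : "success");
}
```

With these inputs, "po", "box" and "P.O. Box 12" are flagged, while "post", "hippo" and "sandbox" pass.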
Use \b to catch word boundaries.
The regex fragment:
\b(po|p\.o)\b
will only match if a sentence contains the word po or the word p.o.
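To see what the fragment picks out, the following sketch returns the first matched text or null. The helper name firstMatch and the sample sentences are assumptions added for illustration.

```javascript
// The fragment from this answer, used without flags.
const fragment = /\b(po|p\.o)\b/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function firstMatch(sentence) {
  const result = sentence.match(fragment);
  return result ? result[0] : null;
}

console.log(firstMatch("send it to po 12"));  // "po"
console.log(firstMatch("the p.o is closed")); // "p.o"
console.log(firstMatch("post office"));       // null
console.log(firstMatch("hippo"));             // null
```

Because \b needs a word character on one side and a non-word character (or the string edge) on the other, the "po" inside "post" and "hippo" is skipped.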
A function that returns whether a sentence contains any of the words "po", "p.o.", or "box", in any capitalization:
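function containsPObox(sentence) {
  // Look for "po", "p.o", "p.o." or "box" as whole words, ignoring case.
  var matches = sentence.match(/\bp\.?o\b\.?|\b(box)\b/gi);
  // match() returns null when nothing is found, otherwise a non-empty array.
  return matches !== null;
}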
Regarding the regex: /\bp\.?o\b\.?|\b(box)\b/gi
Breaking it down...
\b -> word boundary (the position between a word character and a non-word character such as a space or period, or the start or end of the string)
p -> 'p'
\.? -> optional '.'
o -> 'o'
\b -> word boundary
\.? -> optional '.'
| -> "or"
\b -> word boundary
(box) -> 'box'
\b -> word boundary
g -> global: find every match in the sentence, not just the first
i -> case insensitive
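The effect of the g flag and of the trailing \.? is easiest to see by listing every match. This is a small sketch; the helper name findAll and the sample sentences are assumptions added for illustration.

```javascript
// The pattern from this answer, with the g and i flags.
const pattern = /\bp\.?o\b\.?|\b(box)\b/gi;

// Hypothetical helper: with the g flag, String.prototype.match returns every
// full match (capture groups are not included), or null when there is none.
function findAll(sentence) {
  return sentence.match(pattern) || [];
}

console.log(findAll("Ship to P.O. Box 7 or po box 9")); // [ 'P.O.', 'Box', 'po', 'box' ]
console.log(findAll("sandbox post"));                   // []
```

The first result includes the trailing period of "P.O." because the final \.? sits after the word boundary.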
Like sh54's response, but without the | block.
\b(p\.?o\.?)\b
This will match po and p.o (and p.o. when a word character directly follows the final period). Note that it does not match box, which still needs its own alternative.