Another tricky preg_match
I just need to see if a paragraph contains a "stop word"; the stop words are in the array below.
I had the formula as:
$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");
foreach ($pattern_array as $pattern) {
    if (preg_match('/'.$pattern.')/i', $paragraph)) {
        $stopwords = 1;
    }
}
This works well enough, but with short words like 'pan', a word like 'panko' is identified as a stop word.
So the regex would need to require that the word is preceded by a space or the start of a line, and followed by a full stop, space, comma, or other non-word character.
Also how could I tell php to exit the loop as soon as a stop word is identified?
Thanks guys, slowly learning regex as I go!
Use \b(preheat|minutes|stir|heat|put|beat|bowl|pan)\b
as your regex. That way, you only need one regex (no looping necessary), and by using the \b
word boundary assertions, you make sure that only entire words match.
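If the stop words already live in an array, the single pattern can be assembled from it rather than typed out by hand. This is only a sketch: build_stopword_pattern is a made-up helper name, the sample sentences are invented, and preg_quote is added as a precaution in case a stop word ever contains a regex metacharacter.

```php
<?php
// Hypothetical helper (not from the thread): builds one case-insensitive,
// whole-word pattern from a list of stop words.
function build_stopword_pattern(array $words): string
{
    // preg_quote escapes regex metacharacters; '/' is the delimiter used below.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);

    // \b on both sides so that only entire words match.
    return '/\b(?:' . implode('|', $escaped) . ')\b/i';
}

$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");
$pattern = build_stopword_pattern($pattern_array);

// Invented sample paragraphs. preg_match returns 1 on a match and 0 otherwise.
$no_hit = preg_match($pattern, "Coat the chicken in panko crumbs.");
$hit    = preg_match($pattern, "Heat the pan, then add the oil.");
```

Since preg_match stops at the first match, there is no loop to break out of.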
Haven't tried this, but \b
should be the assertion you're looking for. From the PHP manual:
\b word boundary
Your code would then look something like this:
$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");
foreach ($pattern_array as $pattern) {
    if (preg_match('/\b'.$pattern.'\b/i', $paragraph)) { // also removed the ')'
        $stopwords = 1;
        break; // to exit the loop
    }
}
Edit: seems people are better off using \b, so changed this accordingly
You need to add \b
(which stands for word boundary) to your regex like this:
'/\b'.$pattern.'\b/i'
You seem to have a typo in your code: the closing bracket is either meant literally (in which case the pattern won't match parts of words) or it is an unmatched closing bracket.
1. You can use "\b" to check for word boundaries. A word boundary is defined as the boundary between a word character and a non-word character. Word characters are letters, digits, and the underscore.
2. You can do it all at one go, by using "|":
$stopwords = preg_match('/\\b(preheat|minutes|stir|heat|..other words..|pan)\\b/i', $paragraph);
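To see what the word boundary definition in point 1 means in practice, here is a small check of \b against a few strings. The test strings are made up for illustration and only the single word 'pan' is used, to keep it short.

```php
<?php
// Single-quoted string, so \b reaches the regex engine unchanged.
$pattern = '/\bpan\b/i';

// Illustrative inputs only; none of these strings come from the question.
$samples = array("Use a pan.", "Pan, then oven", "panko crumbs", "saucepan", "pan_2", "pan-fried");

$results = array();
foreach ($samples as $text) {
    // 1 if 'pan' occurs as a whole word, 0 if not.
    $results[$text] = preg_match($pattern, $text);
}

// "Use a pan."     -> 1 (space before, full stop after)
// "Pan, then oven" -> 1 (start of string before, comma after; /i ignores case)
// "panko crumbs"   -> 0 ('k' is a word character, so no boundary after 'pan')
// "saucepan"       -> 0 ('e' is a word character, so no boundary before 'pan')
// "pan_2"          -> 0 (underscore counts as a word character)
// "pan-fried"      -> 1 (hyphen is a non-word character)
```

Note the last two cases: because the underscore counts as a word character and the hyphen does not, "pan_2" is skipped while "pan-fried" is flagged.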