PHP regex for a word collection around a search phrase
Hi, I am trying to create a regex that will do the following:
Grab up to 5 words before the search phrase and up to 5 words after it from a block of text (fewer if fewer are available on either side). By "words" I mean words, numbers, or whatever else is in the block of text.
For example, given:
Welcome to Stack Overflow! Visit your user page to set your name and email.
If you were to search for "visit", it would return: Welcome to Stack Overflow! Visit your user page to set
The idea is to use preg_match_all in PHP to get a list of search results showing where the search phrase appears in the text, one result for each occurrence.
Thanks in advance :D
On a side note, there may be a better way to get this result. If you think there is, please feel free to throw it in the pool; this is just the first approach I thought of, and I'm not sure it is the best one :D
How about this:
(\S+\s+){0,5}\S*\bvisit\b\S*(\s+\S+){0,5}
will match up to five "words" (fewer if the text is shorter) before and after your search word (in this case, visit).
preg_match_all(
'/(\S+\s+){0,5} # Match up to five "words" before the search term
\S* # Match any non-whitespace (e.g. punctuation) attached before the search term
\b # Assert position at the start of a word
visit # Match the search term
\b # Assert position at the end of a word
\S* # Match any non-whitespace (e.g. punctuation) attached after the search term
(\s+\S+){0,5} # Match up to five "words" after the search term
/ix',
$subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
I'm defining a "word" as a sequence of non-whitespace characters, separated by at least one whitespace.
The search words should be actual words (starting and ending with an alphanumeric character).
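Since the search term will usually come from user input rather than being hard-coded, here is a minimal sketch of how the same pattern could be wrapped in a function. The function name contextSnippets and its parameters are made up for illustration; preg_quote() is used so that regex metacharacters in the term are treated literally.

```php
<?php
// Hypothetical helper (not part of the answer above): returns every snippet
// of up to $n words on either side of $term, matched case-insensitively.
function contextSnippets(string $text, string $term, int $n = 5): array
{
    // Escape regex metacharacters in the term; the second argument
    // also escapes the "/" delimiter used below.
    $quoted = preg_quote($term, '/');

    // Same pattern as above, with non-capturing groups and a variable word count.
    $pattern = '/(?:\S+\s+){0,' . $n . '}\S*\b' . $quoted . '\b\S*(?:\s+\S+){0,' . $n . '}/i';

    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = "Welcome to Stack Overflow! Visit your user page to set your name and email.";
print_r(contextSnippets($text, "visit"));
```

Note that preg_match_all returns non-overlapping matches, so if the term occurs twice within a few words, the second occurrence may end up inside the trailing context of the first match instead of getting its own result.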
You can do the following (it is somewhat computation-heavy, so it wouldn't be efficient for very long strings):
<?php
$phrase = "Welcome to Stack Overflow! Visit your user page to set your name and email.";
$keyword = "Visit";
$wordCount = 5;
// Split on runs of whitespace so repeated spaces do not produce empty "words".
$words = preg_split("/\s+/", trim($phrase));
$lcWords = array_map('strtolower', $words);
// array_search() returns false when the keyword is absent, so compare with ===.
$position = array_search(strtolower($keyword), $lcWords);
if ($position === false) {
    exit("Keyword not found");
}
$indexBegin = max($position - $wordCount, 0);
$len = min(count($words), $position - $indexBegin + $wordCount + 1);
echo implode(" ", array_slice($words, $indexBegin, $len));
// prints: Welcome to Stack Overflow! Visit your user page to set
Codepad example here
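The array_search approach above only finds the first occurrence, and only when the keyword equals a whole whitespace-separated token (so "Overflow" would not match "Overflow!"). A rough sketch of how it might be extended to every occurrence follows; the function name keywordContexts and the set of punctuation characters stripped are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns one snippet per occurrence of $keyword,
// each with up to $wordCount words of context on either side.
function keywordContexts(string $phrase, string $keyword, int $wordCount = 5): array
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);
    $needle = strtolower($keyword);
    $snippets = [];

    foreach ($words as $i => $word) {
        // Strip common leading/trailing punctuation (an assumed set)
        // so that "Visit," still matches "visit".
        $bare = strtolower(trim($word, ".,;:!?\"'()"));
        if ($bare !== $needle) {
            continue;
        }
        $begin = max($i - $wordCount, 0);
        $len = $i - $begin + $wordCount + 1;
        // array_slice() simply stops at the end of the array if $len overshoots.
        $snippets[] = implode(' ', array_slice($words, $begin, $len));
    }
    return $snippets;
}

$phrase = "Welcome to Stack Overflow! Visit your user page to set your name and email.";
print_r(keywordContexts($phrase, "your"));
```

Unlike the regex version, each occurrence gets its own snippet here even when two occurrences are close together, at the cost of collapsing the original whitespace into single spaces.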