PHP regex for a word collection around a search phrase

Hi, I am trying to create a regex that will do the following:

Grab 5 words before the search phrase (or x if there are only x words there) and 5 words after the search phrase (or x if there are only x words there) from a block of text. By "words" I mean words or numbers, whatever is in the block of text.

For example:

Welcome to Stack Overflow! Visit your user page to set your name and email.

If you were to search for "visit", it would return: Welcome to Stack Overflow! Visit your user page to set

The idea is to use preg_match_all in PHP to get a list of search results, one for each occurrence of the search phrase, showing where in the text it appears.

Thanks in advance :D

On a side note, there may be a better way to get to my result. If you feel there is, please feel free to throw it in the pool; I'm not sure this is the best way, just the first one I thought of to do what I need :D


How about this:

(\S+\s+){0,5}\S*\bvisit\b\S*(\s+\S+){0,5}

This will match up to five "words" before and after your search word (in this case visit), accepting fewer if the text is shorter.

preg_match_all(
    '/(\S+\s+){0,5} # Match five (or less) "words"
    \S*             # Match (if present) punctuation before the search term
    \b              # Assert position at the start of a word
    visit           # Match the search term
    \b              # Assert position at the end of a word
    \S*             # Match (if present) punctuation after the search term
    (\s+\S+){0,5}   # Match five (or less) "words"
    /ix', 
    $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];

I'm defining a "word" as a sequence of non-whitespace characters, separated by at least one whitespace.

The search words should be actual words (starting and ending with an alphanumeric character).
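
To reuse the pattern above for an arbitrary search term, something along these lines might work. This is only a sketch: the function name findContexts and its parameters are made up for illustration, and the term is assumed to start and end with an alphanumeric character so that the \b assertions hold. preg_quote() escapes any regex metacharacters in the term, and its second argument also escapes the / delimiter.

```php
<?php
// Hypothetical helper (not part of the original answer): returns each match
// of $term together with up to $n "words" on either side.
function findContexts(string $text, string $term, int $n = 5): array
{
    // Escape regex metacharacters in the term, including the "/" delimiter.
    $quoted = preg_quote($term, '/');

    // Same pattern as above, with non-capturing groups and a configurable count.
    $pattern = '/(?:\S+\s+){0,' . $n . '}\S*\b' . $quoted . '\b\S*(?:\s+\S+){0,' . $n . '}/i';

    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full matches only
}

$text = 'Welcome to Stack Overflow! Visit your user page to set your name and email.';
print_r(findContexts($text, 'visit'));
```

Note that preg_match_all returns non-overlapping matches, so two occurrences of the term that sit within a few words of each other will end up in a single result rather than two.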


You can do the following (it is a bit computation heavy, so it wouldn't be efficient for very long strings, and it only finds the first occurrence of the keyword):

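<?php
$phrase = "Welcome to Stack Overflow! Visit your user page to set your name and email.";
$keyword = "Visit";
$wordCount = 5; // number of words to keep on each side of the keyword

// Split on whitespace; the lowercased copy makes the lookup case-insensitive.
$words = preg_split('/\s+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);
$lcWords = array_map('strtolower', $words);

// array_search() returns false when the keyword is absent, so compare with ===.
$position = array_search(strtolower($keyword), $lcWords);
if ($position === false) {
    exit("Keyword not found\n");
}
$indexBegin = max($position - $wordCount, 0);
$len = $position - $indexBegin + $wordCount + 1; // array_slice() stops at the end of the array
echo implode(" ", array_slice($words, $indexBegin, $len));
// prints: Welcome to Stack Overflow! Visit your user page to set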

Codepad example here
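
Since the question asks for every occurrence, the same word-splitting idea could be extended roughly as follows. This is a sketch under a few assumptions: the function name keywordContexts is hypothetical, and the list of punctuation characters stripped before comparing is an arbitrary choice that you would likely want to adjust.

```php
<?php
// Hypothetical helper: returns a context string for every word in $phrase
// that equals $keyword (case-insensitively, ignoring surrounding punctuation).
function keywordContexts(string $phrase, string $keyword, int $wordCount = 5): array
{
    $words = preg_split('/\s+/', $phrase, -1, PREG_SPLIT_NO_EMPTY);
    $needle = strtolower($keyword);
    $contexts = [];

    foreach ($words as $i => $word) {
        // Strip leading/trailing punctuation so "Visit," still matches "visit".
        $bare = strtolower(trim($word, '.,;:!?"\'()'));
        if ($bare !== $needle) {
            continue;
        }
        // Clamp the start at 0; array_slice() clamps the end by itself.
        $begin = max($i - $wordCount, 0);
        $length = $i - $begin + $wordCount + 1;
        $contexts[] = implode(' ', array_slice($words, $begin, $length));
    }
    return $contexts;
}

$phrase = 'Welcome to Stack Overflow! Visit your user page to set your name and email.';
print_r(keywordContexts($phrase, 'Visit'));
```

Unlike the regex approach, each occurrence gets its own result here even when two occurrences are close together, at the cost of matching only single-word keywords.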
