Regex find the first word
I'm trying to use a regex to wrap the first word of a page's content in a span. The content contains HTML, so I need to make sure an actual word is chosen rather than part of a tag. The content changes for every page.
Current script is:
preg_match('/(<(.*?)>)*/i',$page_content,$matches);
$stripped = substr($page_content,strlen($matches[0]));
preg_match('/\b[a-z]* \b/i',$stripped,$strippedmatch);
echo substr($page_content, 0, strlen($matches[0])).'<span class="h1">'.$strippedmatch[0].'</span>'.substr($stripped, strlen($strippedmatch[0]));
However if the $page_content is
<p><span class="title">This is </span> my title!</p>
Then my regex thinks the first word is "span" and adds the tags around that.
Is there any way to fix this? (or a better way to do it).
This seems to work...
(?<=\>)\b\w*\b|^\w*\b
If you also want to allow leading whitespace (remember to trim the resulting string):
(?<=>)\s*\b\w*\b|^\s*\w*\b
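Here is a rough sketch of how a pattern like this could be plugged into PHP. The function name `wrapFirstWord` and its `$class` parameter are made up for illustration. The two alternatives are folded into one group, and the leading whitespace is captured separately so it stays outside the span and no trimming is needed:

```php
<?php
// Hypothetical helper (name and parameters are assumptions, not from the thread).
// Wraps the first word that directly follows a '>' or starts the string.
function wrapFirstWord(string $html, string $class = 'h1'): string
{
    // (?:(?<=>)|^)  position right after a '>' or at the start of the string
    // (\s*)         optional whitespace, kept outside the span
    // (\w+)         the word itself
    $pattern = '/(?:(?<=>)|^)(\s*)(\w+)/';
    $replacement = '$1<span class="' . $class . '">$2</span>';

    // The limit of 1 stops after the first match.
    return preg_replace($pattern, $replacement, $html, 1);
}

echo wrapFirstWord('<p><span class="title">This is </span> my title!</p>'), "\n";
```

Like the pattern above, this only looks at `>` characters, so a `>` inside an attribute value or a comment would confuse it.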
If I understand you correctly, you want a tag around the first word that is not part of a tag. With regex you could do that like this (the leading tags are captured as a whole in \1, so none of them is lost in the replacement):
$code = preg_replace('/^((?:<.+?>\s*)+?)(\w+)/i', '\1<span class="h1">\2</span>', $code);
It steps over the leading tags until it reaches the first text outside them.
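One thing to watch with patterns of this shape: when a capturing group is itself repeated, the backreference holds only the last repetition. The sketch below compares the two forms on the sample content from the question. The variable names are only for illustration:

```php
<?php
$html = '<p><span class="title">This is </span> my title!</p>';

// Repeated capturing group: \1 holds only the last tag matched,
// so the leading <p> disappears from the result.
$lossy = preg_replace('/^(<.+?>\s*)+?(\w+)/', '\1<span class="h1">\2</span>', $html);

// Capturing group around a non-capturing repeat: \1 holds all leading tags.
$kept = preg_replace('/^((?:<.+?>\s*)+?)(\w+)/', '\1<span class="h1">\2</span>', $html);

echo $lossy, "\n";
echo $kept, "\n";
```

Both forms still require the content to start with a tag. Changing `+?` to `*?` would also cover content that starts with plain text.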
You shouldn't be using regex for this, but if you insist, you can try something like this:
<?php
$texts = array(
'<p><span class="title">This is </span> my title!</p>',
'<1> <2> <3> blah blah <4> <5> blah',
'garbage <1> <2> real stuff begins <3> <4>',
);
foreach ($texts as $text) {
print preg_replace('/(>\s*)(\w+)/', '\1{{\2}}', $text, 1)."\n";
}
?>
This prints:
<p><span class="title">{{This}} is </span> my title!</p>
<1> <2> <3> {{blah}} blah <4> <5> blah
garbage <1> <2> {{real}} stuff begins <3> <4>
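For the non-regex route, one option is PHP's DOM extension: parse the fragment, find the first text node that contains a word, and split it around that word. This is only a sketch under some assumptions: the function name `wrapFirstWordDom` and the `wrap` container id are invented here, the `dom` extension is available, and the content is plain ASCII HTML that libxml can parse (other encodings would need extra handling).

```php
<?php
// Hypothetical helper (name, parameters and the "wrap" id are assumptions).
function wrapFirstWordDom(string $html, string $class = 'h1'): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    // Wrap the fragment in a known container so it can be extracted again.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);

    // Walk the text nodes in document order; tags are never inspected.
    foreach ($xpath->query('.//text()', $wrap) as $node) {
        if (!preg_match('/^(\W*)(\w+)(.*)$/s', $node->nodeValue, $m)) {
            continue; // no word in this node (whitespace or punctuation only)
        }
        $span = $doc->createElement('span');
        $span->setAttribute('class', $class);
        $span->appendChild($doc->createTextNode($m[2]));

        // Replace the text node with: leading text, the span, remaining text.
        $parent = $node->parentNode;
        $parent->insertBefore($doc->createTextNode($m[1]), $node);
        $parent->insertBefore($span, $node);
        $parent->insertBefore($doc->createTextNode($m[3]), $node);
        $parent->removeChild($node);
        break; // only the first word is wrapped
    }

    // Serialize the container's children to get the fragment back.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo wrapFirstWordDom('<p><span class="title">This is </span> my title!</p>'), "\n";
```

Because the parser re-serializes the markup, whitespace and malformed tags in the result may differ from the input. That is the trade-off for not having to guess where tags end.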