Matching string that isn't wrapped with certain characters
I am trying to automatically detect tags and convert them into hyperlinks. The problem is that this has to be done after the string has been run through the following:
htmlspecialchars($string, ENT_QUOTES, "UTF-8");
Now the ' symbol, for example, is turned into &#39;. The tags are in the form #[a-zA-Z0-9\-_]+, so the script treats the encoded special characters as tags because of the #39 part.
How do I match with preg_match so that it does not treat # marks preceded by an & mark as tags?
Thank you!
You have to use a lookbehind assertion to check that the # is not preceded by a &. Try this:
"/(?<!&)#[\w-]+/"
The (?<!&) causes the # to match only if it is not preceded by &. The \w part matches [a-zA-Z0-9_].
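To see the lookbehind applied to the original goal, here is a minimal sketch that escapes a string and then turns the tags into links. The sample text and the /tag/<name> URL scheme are assumptions made up for illustration, and a capture group is added around the tag name so it can be reused in the replacement.

```php
<?php
// Hypothetical input: text with an apostrophe and two tags.
$raw = "It's #php and #regex-tips";

// Escape first, as in the question; the apostrophe becomes a numeric entity.
$escaped = htmlspecialchars($raw, ENT_QUOTES, "UTF-8");

// (?<!&) rejects any # that directly follows &, so the entity is skipped.
// The parentheses capture the tag name without the leading #.
$pattern = '/(?<!&)#([\w-]+)/';

// Hypothetical URL scheme: /tag/<name>.
$linked = preg_replace($pattern, '<a href="/tag/$1">#$1</a>', $escaped);

echo $linked, "\n";
```

Only #php and #regex-tips should end up wrapped in anchors here; the # inside the encoded apostrophe is left alone.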
You may also want to check that the tag is preceded by whitespace or is at the start of the string:
"/(?:^|\s)#[\w-]+/"
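As a rough illustration of the stricter variant, the sketch below extracts tag names with preg_match_all, using the non-capturing group (?:^|\s) for "start of string or whitespace". The sample string is invented, and the capture group around the name is an addition so that the leading whitespace (which is part of the overall match) is not included in the result.

```php
<?php
// Hypothetical sample: a tag at the start, a # glued to a word,
// an encoded apostrophe, and a tag after a space.
$text = "#start middle#notatag &#039; and #end-tag";

// A # only counts when it is at the start of the string or follows whitespace.
preg_match_all('/(?:^|\s)#([\w-]+)/', $text, $matches);

// $matches[1] holds the captured tag names without the # or the whitespace.
$tags = $matches[1];

print_r($tags);
```

With this pattern, "middle#notatag" and the encoded apostrophe are both ignored, because neither # is at the start of the string or directly after whitespace.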
Use a Look Behind assertion
(?<!a)b
matches a "b" that is not preceded by an "a"
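A small sketch of that rule in isolation, assuming the made-up test string 'ab cb b'; PREG_OFFSET_CAPTURE is used so the position of each match is visible.

```php
<?php
// Each entry in $m[0] is a pair: [matched text, byte offset].
preg_match_all('/(?<!a)b/', 'ab cb b', $m, PREG_OFFSET_CAPTURE);

// Keep only the offsets of the matched "b" characters.
$offsets = array_column($m[0], 1);

print_r($offsets);
```

The "b" at offset 1 follows an "a" and is skipped; the ones at offsets 4 and 6 follow "c" and a space, so they match.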
In your case, that would be
(?<!&)#[a-zA-Z0-9\-_]+
This will not match a # preceded by &.