Matching pattern with dot in PHP

I need to find and replace a substring that has a dot in it. It's important to keep the search strict to word boundaries (\b). Here's an example script to reproduce the problem (I need to match "test."):

<?php
# 1.php
$string = 'test. lorem ipsum';
if(!preg_match('~\btest\.\b~i', $string)) echo 'no match 1' . PHP_EOL;
if(!preg_match('~\btest\b\.~i', $string)) echo 'no match 2' . PHP_EOL;

And here's output:

x:\>php 1.php
no match 1

x:\>php -v
PHP 5.2.8 (cli) (built: Dec  8 2008 19:31:23)
Copyright (c) 1997-2008 The PHP Group

BTW, I also don't get any match if there are square brackets in the search pattern. I do escape them, of course, but it has no effect.


Regexes can't read; they don't really know what a "word" is. To them, a word boundary is simply a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one:

(?<=\w)(?!\w)|(?=\w)(?<!\w)

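So the position after the . in your first test would only be a word boundary if it were followed by a word character ([A-Za-z0-9_]). In your string the . is followed by a space, so \b fails there. In your second test the \b sits between the t and the ., which is a word boundary, so that pattern matches. (In some regex flavors the definition of a word character is broader and includes accented letters and letters from other scripts, but in PHP, by default, it's only ASCII letters, digits and the underscore.)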
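To make the boundary rule concrete, here is a small sketch that reuses the string from the question and adds one extra input ('test.lorem', which is my own illustration, not from the original script) where a word character does follow the dot:

```php
<?php
$string = 'test. lorem ipsum';

// \b after the dot: '.' and ' ' are both non-word characters, so no boundary there.
var_dump(preg_match('~\btest\.\b~i', $string));      // int(0)

// \b between "t" and ".": a word character followed by a non-word character.
var_dump(preg_match('~\btest\b\.~i', $string));      // int(1)

// Illustrative input: a word character ("l") follows the dot, so \b matches after it.
var_dump(preg_match('~\btest\.\b~i', 'test.lorem')); // int(1)
```

The third call shows that the first pattern isn't broken as such; it just asks for something different from "a dot at the end of a word".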

I suspect what you want to do is make sure the . is either followed by whitespace, or it's at the end of the string. You can express that directly as a positive lookahead:

'~\btest\.(?=\s|$)~i'

...or more succinctly, as a negative lookahead:

'~\btest\.(?!\S)~i'

...in other words, if there's a next character, it's not a non-whitespace character.
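Since you want to replace as well as match, here is a minimal sketch of how the negative-lookahead version might be used with preg_replace. The function name replace_term and the replacement text 'exam.' are made up for illustration; preg_quote is used so that the dot (or any other metacharacter in the search term) is treated literally:

```php
<?php
// Hypothetical helper (not from the original script): replace a literal term
// only when it starts at a word boundary and is followed by whitespace or
// the end of the string.
function replace_term($term, $replacement, $subject)
{
    // preg_quote escapes regex metacharacters such as . [ ] and, via the
    // second argument, the ~ delimiter.
    $pattern = '~\b' . preg_quote($term, '~') . '(?!\S)~i';
    return preg_replace($pattern, $replacement, $subject);
}

echo replace_term('test.', 'exam.', 'test. lorem ipsum'), PHP_EOL; // exam. lorem ipsum
echo replace_term('test.', 'exam.', 'test.lorem ipsum'), PHP_EOL;  // test.lorem ipsum (unchanged)
echo replace_term('test.', 'exam.', 'a test.'), PHP_EOL;           // a exam.
```

Note that the leading \b assumes the term starts with a word character. If the term starts with something like an escaped square bracket, the same boundary problem appears at the front, and a lookbehind such as (?<!\S) is probably what you want there instead of \b.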
