How can I check a string with regex until I find a whitespace?

I have this regex with the preg_replace function in PHP:

$str=preg_replace(
    '#\b((Hello ).+)#',
    '<a class="lforum" href="$1">$1</a>',
    $str);

It checks for strings that start with "Hello " (including the space) followed by one or more characters of any kind.

So for example :

Hello Mark \\ is checked
HelloMark  \\ is not checked

The problem is that this string is also checked in its entirety:

Hello Mark Cordi

because a space is a character too.

I don't want this. Rather, if the string is Hello Mark Cordi, it must replace only Hello Mark.

How can I do this? Thanks

EDIT: Problem with newline

My current function:

echo example(htmlentities($myString, ENT_QUOTES, "UTF-8"));

function example($str) {
    $str=preg_replace(
        '#((Hello )[^ \n]+)#',
        '<a class="lforum" href="$1">$1</a>',
        $str);

    return nl2br($str);     
}

If $myString is :

Hello Mario
Ciao

(notice the newline, so at the end of Hello Mario there is a \n) the output is this :

<a class="lforum" href="Hello Mario<br />">Hello Mario<br /></a><br />Ciao

instead of :

<a class="lforum" href="Hello Mario">Hello Mario</a><br />Ciao

So it includes that \n in $1 during the replacement, and it shouldn't :(


Replace only word characters, using \w instead of .:

$str=preg_replace(
    '#\b((Hello )\w+)#',
    '<a class="lforum" href="$1">$1</a>',
    $str);

Word characters are:

  • A-Za-z
  • 0-9
  • _

This is probably what you actually want, rather than just excluding white space.
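As a rough illustration of the \w version, the sketch below wraps the pattern from this answer in a helper (linkHello is a hypothetical name, not from the question) and runs it on the sample strings. The third call uses a made-up hyphenated name to show a limitation: since - is not a word character, the match stops there.

```php
<?php
// Hypothetical helper; the pattern and replacement are the ones shown above.
function linkHello(string $str): string {
    return preg_replace(
        '#\b((Hello )\w+)#',
        '<a class="lforum" href="$1">$1</a>',
        $str
    );
}

echo linkHello('Hello Mark Cordi'), "\n";
// <a class="lforum" href="Hello Mark">Hello Mark</a> Cordi

echo linkHello('HelloMark'), "\n";
// HelloMark   (no space after Hello, so nothing is replaced)

echo linkHello('Hello Jean-Luc'), "\n";
// <a class="lforum" href="Hello Jean">Hello Jean</a>-Luc
```

If names can contain hyphens, apostrophes, or accented letters, a negated class such as [^\s]+ may be the better fit.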


Use [^ ] (anything but a space) instead of the dot (.).

[^abc] means "anything but a, b and c". Here we use it with a single space.
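A minimal sketch of the [^ ] idea, assuming the same replacement string as in the question. It also shows why a plain space is not enough once the text contains line breaks: a newline is not a space, so [^ ]+ runs straight through it.

```php
<?php
$pattern     = '#\b(Hello [^ ]+)#';
$replacement = '<a class="lforum" href="$1">$1</a>';

// The match now stops at the first space after the name.
echo preg_replace($pattern, $replacement, 'Hello Mark Cordi'), "\n";
// <a class="lforum" href="Hello Mark">Hello Mark</a> Cordi

// With a newline instead of a space, the second line is swallowed too.
$linked = preg_replace($pattern, $replacement, "Hello Mario\nCiao");
var_dump(strpos($linked, "Mario\nCiao\"") !== false);
// bool(true): the href contains "Hello Mario\nCiao"
```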

Edit (2):

This is working:

  $str=preg_replace(
        '#(Hello [^\s\n<]+)#',
        '<a class="lforum" href="$1">$1</a>',
        $str);

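It works when the name is followed by <, a newline, or a space, as in Mark<..., Mark\n..., and Mark ... (where ... stands for any text).

[^\s\n<] means "anything except whitespace (\s), newlines (\n), and <". Since \s already covers \n, the explicit \n is redundant but harmless.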
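One plausible reason the [^ \n] pattern from the question still picked up a line break is that the text used Windows line endings (\r\n). That is an assumption, since the question does not show the raw bytes. Under that assumption the \r is not excluded, ends up in $1, and nl2br then puts a <br /> in front of it. The sketch below reproduces this with two hypothetical helpers (linkAndBreak and visible) and compares it with a \s-based class.

```php
<?php
// Hypothetical helper: applies one pattern, then nl2br, as the question's function does.
function linkAndBreak(string $pattern, string $str): string {
    $str = preg_replace($pattern, '<a class="lforum" href="$1">$1</a>', $str);
    return nl2br($str);
}

// Hypothetical helper: turns CR and LF into visible \r and \n for printing.
function visible(string $s): string {
    return str_replace(["\r", "\n"], ['\r', '\n'], $s);
}

// Assumption: the input arrives with Windows line endings.
$input = "Hello Mario\r\nCiao";

// Pattern from the question: \r is neither a space nor \n, so it is captured.
echo visible(linkAndBreak('#((Hello )[^ \n]+)#', $input)), "\n";
// <a class="lforum" href="Hello Mario<br />\r">Hello Mario<br />\r</a><br />\nCiao

// \s covers space, tab, \r and \n, so the match ends right after the name.
echo visible(linkAndBreak('#(Hello [^\s<]+)#', $input)), "\n";
// <a class="lforum" href="Hello Mario">Hello Mario</a><br />\r\nCiao
```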


Based on your question, the edit section, and the various comments, I believe the following code should work for you:

$str = "Hello Mario
Ciao";
var_dump(example(htmlentities($str, ENT_QUOTES, "UTF-8")));
function example($str) {
    $s=preg_replace(
        '~(Hello\W+[^\W]+)~s',
        '<a class="lforum" href="$1">$1</a>',
        $str);
    return nl2br($s);
}

OUTPUT

string(63) "<a class="lforum" href="Hello Mario">Hello Mario</a><br />
Ciao"

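The important part is \W, which matches any non-word character, including spaces and newlines. Note that the s modifier only changes how the dot (.) behaves, so it has no effect on this pattern, which contains no dot. [^\W] is simply another way of writing \w.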
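A quick way to check what the s modifier and \W contribute here is to run the pattern with and without the modifier. This is a sketch using the same input as above; the square-bracket replacement in the last call is only there to make the match visible.

```php
<?php
$replacement = '<a class="lforum" href="$1">$1</a>';
$input       = "Hello Mario\nCiao";

// Same pattern, with and without the s modifier.
$withS    = preg_replace('~(Hello\W+[^\W]+)~s', $replacement, $input);
$withoutS = preg_replace('~(Hello\W+[^\W]+)~',  $replacement, $input);

var_dump($withS === $withoutS);
// bool(true): the pattern has no dot, so s makes no difference

// \W+ matches a newline between the two words as well as a space.
$spanned = preg_replace('~(Hello\W+\w+)~', '[$1]', "Hello\nMario");
var_dump($spanned === "[Hello\nMario]");
// bool(true)
```

Because \W+ accepts any run of non-word characters, this pattern would also link "Hello, Mario" or "Hello" and "Mario" on separate lines, which may or may not be what you want.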
