
Masking all but first letter of a word using Regex

I'm attempting to create a bad word filter in PHP that will analyze the word and match against an array of known bad words, but keep the first letter of the word and replace the rest with asterisks. Example:

fook would become f***
shoot would become s****

The only part I don't know is how to keep the first letter in the string, and how to replace the remaining letters with something else while keeping the same string length.

$string = preg_replace("/\b(". $word .")\b/i", "***", $string);

Thanks!


$string = 'fook would become';
$word = 'fook';

$string = preg_replace("~\b". preg_quote($word, '~') ."\b~i", $word[0] . str_repeat('*', strlen($word) - 1), $string);

var_dump($string);


$string = preg_replace("/\b(".preg_quote($word[0], '/').")".preg_quote(substr($word, 1), '/')."\b/i", '${1}'.str_repeat('*', strlen($word) - 1), $string);


This can be done in many ways, some of them relying on rather odd auto-generated regexps, but I believe using preg_replace_callback() ends up being more robust:

<?php
# as already pointed out, your words *may* need sanitization

foreach($words as $k=>$v)
  $words[$k]=preg_quote($v,'/');

# and to be collapsed into a **big regexpy goodness**
$words=implode('|',$words);


# after that, a single preg_replace_callback() would do

$string = preg_replace_callback('/\b('. $words .')\b/i', "my_beloved_callback", $string);

function my_beloved_callback($m)
{
  $len=strlen($m[1])-1;

  return $m[1][0].str_repeat('*',$len);
}
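
For completeness, here is a minimal self-contained sketch of the same callback approach that can be run as-is. The word list and the input sentence are made up for illustration, and the anonymous functions assume PHP 5.3 or later.

```php
<?php
// Hypothetical word list and input, chosen only for illustration.
$words  = ['fook', 'shoot'];
$string = 'Fook that, I said shoot. Shooting is fine.';

// Escape each word, then join them into a single alternation.
$alternation = implode('|', array_map(function ($w) {
    return preg_quote($w, '/');
}, $words));

// Keep the first character of the matched text and mask the rest,
// so the original capitalisation and length are preserved.
$masked = preg_replace_callback('/\b(' . $alternation . ')\b/i', function ($m) {
    return $m[1][0] . str_repeat('*', strlen($m[1]) - 1);
}, $string);

echo $masked, "\n"; // F*** that, I said s****. Shooting is fine.
```

Because of the \b anchors, "Shooting" is left alone: only whole-word matches are masked.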


Here is a Unicode-friendly regular expression for PHP. Note that it lowercases every letter except the first one of each word rather than masking it:

function lowercase_except_first_letter($s) {
    // Match every word character that is NOT the first character of a word
    // and pass it to the callback. Excluding \W in the lookbehind also keeps
    // the first letter of words that follow quotes or brackets.
    return preg_replace_callback('/(?<!^|\s|\W)(\w)/u', function($m) {
            return mb_strtolower($m[1]);
        }, $s);
}
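
To tie the Unicode point back to the original question, here is a rough sketch of masking (rather than lowercasing) that counts characters instead of bytes. It assumes the mbstring extension is loaded, that the source file and input are UTF-8, and PHP 7 or later for the type declarations. The helper name mask_word_mb and the word list are hypothetical.

```php
<?php
// Hypothetical helper: keep the first character and mask the rest,
// counting characters (not bytes) so multibyte words keep their length.
function mask_word_mb(string $word): string
{
    return mb_substr($word, 0, 1, 'UTF-8')
         . str_repeat('*', mb_strlen($word, 'UTF-8') - 1);
}

$words  = ['éclair', 'fook']; // illustrative list only
$string = 'An Éclair and a fook.';

$alternation = implode('|', array_map(function ($w) {
    return preg_quote($w, '/');
}, $words));

// (?<!\w) and (?!\w) act as word boundaries here; the /u modifier makes
// the pattern, \w, and case-insensitive matching work on UTF-8 characters.
$masked = preg_replace_callback(
    '/(?<!\w)(' . $alternation . ')(?!\w)/iu',
    function ($m) {
        return mask_word_mb($m[1]);
    },
    $string
);

echo $masked, "\n";
```

With strlen() and $m[1][0] the word "Éclair" would be cut in the middle of a multibyte character; mb_substr() and mb_strlen() avoid that, at the cost of depending on mbstring.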