Remove garbage (non-Arabic) characters from an Arabic string

I needed to remove all non-Arabic characters from a string and eventually, with the help of people on Stack Overflow, came up with the following regex, which strips every character that is not Arabic.

preg_replace('/[^\x{0600}-\x{06FF}]/u','',$string);

The problem is that the regex above removes whitespace too. I have also discovered that I need to keep the characters A-Z, a-z, 0-9 and !@#$%^&*(). How should I modify the regex?

Thanking you


Add the ones you want to keep to your character class:

preg_replace('/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*()]/u','', $string);
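A minimal sketch of how that character class behaves, assuming PHP 7 or later. The function name `keepArabicAndBasicAscii` is hypothetical and only used for illustration; the pattern is the one from the answer with the digits 0-9 that the question asked for.

```php
<?php
// Hypothetical helper name, not part of any library.
function keepArabicAndBasicAscii(string $text): string
{
    // \x{0600}-\x{06FF} is the Unicode Arabic block. The /u flag makes PCRE
    // treat both the pattern and the subject as UTF-8.
    // Everything NOT listed inside [^...] is replaced with an empty string.
    $result = preg_replace('/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*()]/u', '', $text);

    // preg_replace() returns null on failure (for example, invalid UTF-8 input).
    return $result ?? '';
}

// The ASCII comma and the tilde are not in the class, so they are removed;
// the spaces are kept because a literal space is listed in the class.
echo keepArabicAndBasicAscii("نص عربي, test 123 ~!"), PHP_EOL;
```

The pattern sits in a single-quoted PHP string, so `$` and `\x` reach the regex engine unchanged. In a double-quoted string they would need escaping.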


Assume you have this string:

$str = "Arabic Text نص عربي test 123 و,.m,............ ~~~ ٍ،]ٍْ}~ِ]ٍ}";

This keeps only Arabic letters and spaces:

echo preg_replace('/[^أ-ي ]/ui', '', $str);

This keeps only Arabic and English letters, digits, and spaces:

echo preg_replace('/[^أ-يA-Za-z0-9 ]/ui', '', $str);

And this answers your question literally:

echo preg_replace('/[^أ-يA-Za-z0-9 !@#$%^&*()]/ui', '', $str);
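The two answers use different Arabic ranges, and the difference is visible on diacritics and Arabic punctuation. The sketch below assumes the literal `أ` and `ي` in the patterns above are the precomposed code points U+0623 and U+064A, and writes them as escapes so the comparison does not depend on the file's encoding. The `\u{...}` string syntax needs PHP 7.0 or later.

```php
<?php
// U+0646 NOON, U+064E FATHA (a diacritic), U+0635 SAD, U+060C ARABIC COMMA
$sample = "\u{0646}\u{064E}\u{0635}\u{060C}";

// Letter range only: the same code points as أ-ي, written as escapes.
$lettersOnly = preg_replace('/[^\x{0623}-\x{064A}]/u', '', $sample);

// Whole Arabic block, as used in the first answer.
$wholeBlock = preg_replace('/[^\x{0600}-\x{06FF}]/u', '', $sample);

// The letter range drops the diacritic and the Arabic comma.
var_dump($lettersOnly === "\u{0646}\u{0635}"); // bool(true)

// The block range keeps all four characters.
var_dump($wholeBlock === $sample);             // bool(true)
```

Which range fits depends on whether vowel marks and Arabic punctuation count as garbage in your data. The `i` flag in the patterns above should be harmless but is not needed, since the ASCII ranges already list both cases.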


In more detail, building on the example above, suppose this is your string:

$string = '<div>This..</div> <a>is<a/> <strong>hello</strong> <i>world</i> ! هذا هو مرحبا العالم! !@#$%^&&**(*)<>?:";p[]"/.,\|`~1@#$%^&^&*(()908978867564564534423412313`1`` "Arabic Text نص عربي test 123 و,.m,............ ~~~ ٍ،]ٍْ}~ِ]ٍ}"; ';

Code:

echo preg_replace('/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*().]/u','', strip_tags($string));

Allows: English letters, Arabic characters (U+0600 to U+06FF), the digits 0 to 9, spaces, the characters !@#$%^&*() and the period.

Removes: all HTML tags, and every special character not listed above.
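The order of the two calls matters: if the regex ran before `strip_tags()`, the angle brackets would be removed first and the tag names would survive as ordinary letters. A small sketch of the combined step, where the function name `cleanText` is hypothetical:

```php
<?php
// Hypothetical wrapper around the one-liner above.
function cleanText(string $html): string
{
    // 1. Drop the HTML tags while the angle brackets are still present.
    $plain = strip_tags($html);

    // 2. Keep Arabic block characters, ASCII letters, digits, spaces
    //    and the listed punctuation; remove everything else.
    $clean = preg_replace('/[^\x{0600}-\x{06FF}A-Za-z0-9 !@#$%^&*().]/u', '', $plain);

    // preg_replace() returns null on failure, so fall back to an empty string.
    return $clean ?? '';
}

echo cleanText('<b>مرحبا</b> world!'), PHP_EOL;
```

Returning an empty string on failure is only one possible choice. Throwing an exception or checking `preg_last_error()` may suit stricter code better.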
