Cleaning up nasty characters in PHP
Got a little issue where my client is pasting in content from Word into my little text editor in a CMS.
The double quotes are coming back encoded in what looks like some form of UTF.
Any ideas on how I can strip or replace these using PHP when they are read from my MySQL table and displayed?
Here is the link to the page that spits out the dodgy characters, you can see the 'black diamonds of doom' which are causing the headaches.
http://linq.milkbarstudios.com/news_detail.php?id=3
Any suggestions would be greatly appreciated!
This sounds like a bug in your code. When handling text data, you must always know which encoding it is in and convert as necessary. So when the browser sends you UTF-8, you must make sure the string also reaches the database as UTF-8, converting it if the connection or column uses a different encoding (MySQL does support UTF-8 in text columns). That way, the original text will be preserved. Of course, you must do the same when you render the page for the browser: declare the charset as UTF-8 and make sure the bytes you send really are UTF-8.
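As a rough illustration of the "convert as necessary" step, here is a minimal sketch. The function names `to_utf8` and `render_text` are made up for this example, and it assumes that any text which is not already valid UTF-8 came from Word as Windows-1252, which may not be true for your data. It also assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: return $text as valid UTF-8.
// Assumption: anything that is not already valid UTF-8 is Windows-1252.
function to_utf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already fine, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Hypothetical helper: normalise the encoding, then escape for HTML output.
function render_text(string $text): string
{
    return htmlspecialchars(to_utf8($text), ENT_QUOTES, 'UTF-8');
}
```

For this to help, the page itself would also need to be served as UTF-8, for example with `header('Content-Type: text/html; charset=utf-8');`, and the database connection would need a matching charset (with mysqli, something like `$mysqli->set_charset('utf8mb4')`).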
I was actually looking for PHP to replace the dodgy characters.
In the end I found this, which fixes it perfectly:
// Remove every byte outside \x20-\x7F (this also drops newlines, tabs and all non-ASCII text).
$output = preg_replace('/[^\x20-\x7F]+/', '', $output);
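Note that the pattern above deletes the curly quotes outright rather than turning them into straight ones. If you would rather keep readable punctuation, a possible alternative is to map the usual Word characters to ASCII first. This is only a sketch: `replace_smart_punctuation` is a hypothetical name, the list of characters is not exhaustive, and it assumes `$text` is valid UTF-8.

```php
<?php
// Hypothetical helper: map common Word "smart" punctuation to plain ASCII.
// Assumption: $text is valid UTF-8.
function replace_smart_punctuation(string $text): string
{
    $map = [
        "\u{2018}" => "'",   // left single quote
        "\u{2019}" => "'",   // right single quote / apostrophe
        "\u{201C}" => '"',   // left double quote
        "\u{201D}" => '"',   // right double quote
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // ellipsis
    ];
    // strtr() with an array replaces every key with its value in one pass.
    return strtr($text, $map);
}
```

Something like `$output = replace_smart_punctuation($output);` could then run before, or instead of, the `preg_replace` call.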