开发者

What is WordPress doing for content encoding in its MySQL database?

For some convoluted reasons best left behind us, I re开发者_高级运维quire direct access the contents of a WordPress database. I'm using MySQL 5.0.70-r1 on Gentoo with WordPress 2.6, and Perl 5.8.8 ftr.

So, sometimes we get high-order characters in the blog, we have quite a few authors contributing too, for the most part these characters end up in WordPress's database in wp_posts.post_content or wp_postmeta.meta_value, Wordpress is displaying these correctly on its site, but the database stores it using single byte encoding that I can't figure out how to convert to the correct string. Today's example:

The blog shows this, and doesn't even seem to escape any chars in the HTML,

   Hãhãhães  

but the database, when viewed via the MySQL prompt, has,

   Hãhãhães

So clearly this is some kind of double-byte encoding issue, but I don't know how I can correct it. I need to be able to pull that second string from the database (because that's what it gives me) and convert it to the first one, and I need to do so using Perl.

Also, just to help unmuddy any waters, I took these strings and printed out the ASCII codes for each character using Perl's ord() function.

Here is the output of the "wrong" string:

H = 72
à = 195
£ = 163
h = 104
à = 195
£ = 163
h = 104
à = 195
£ = 163
e = 101
s = 115

This is the correct string, that I need to produce in my script:

H = 72
ã = 227
h = 104
ã = 227
h = 104
ã = 227
e = 101
s = 115


How about SET NAMES 'utf8'?


I fixed it... Thanks to those who read and/or tried.

my $dbh = DBI->connect('mysql:etc:etc');
$dbh->{mysql_enable_utf8}++;  #<---- solution

That's all, sigh...

Not sure about the MySQL prompt thing, because I don't really care, but I'm sure it's a similar solution, to make sure MySQL is returning results in double-bytes to its prompt. Though see my comment above, "set character_set_*" variables didn't seem to affect it though.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜