String not valid UTF-8 (BSON::InvalidStringEncoding) when saving a UTF8 compatible string to MongoDB through Mongoid ORM
I am importing data from a MySQL table into MongoDB using Mongoid for my ORM. I am getting an error when trying to save an email address as a string. The error is:
/Library/Ruby/Gems/1.8/gems/bson-1.2.4/lib/../lib/bson/bson_c.rb:24:in `serialize': String not valid UTF-8 (BSON::InvalidStringEncoding)
from /Library/Ruby/Gems/1.8/gems/bson-1.2.4/lib/../lib/bson/bson_c.rb:24:in `serialize'
From my GUI - this is a screenshot of the table info. You can see it's encoded in UTF8.
Also from my GUI - this is a screen shot of the field in my MySQL table that I am importing
This is what happens when I grab the data from MySQL CLI.
And finally, when I inspect the data in my ruby object, I get something that looks like this:
I'm a bit confused here because, regardless, my table is in UTF-8, and that funky character is apparently a valid UTF-8 character as a double byte. Anyone know why I'm getting this error?
Try using this helper:
http://snippets.dzone.com/posts/show/4527
It adds a utf8? method to String, so you can take the string you read from MySQL and check whether it is valid UTF-8:
my_string.utf8?
If it is not, you can try changing the encoding of your string with the other methods it provides, such as:
my_string.asciify_utf8
my_string.latin1_to_utf8
my_string.cp1252_to_utf8
my_string.utf16le_to_utf8
The string may actually be stored in MySQL in one of these encodings.
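For reference, the same check-then-convert idea can be sketched without the helper. This is only a rough illustration and assumes Ruby 2.0 or later (the question is on Ruby 1.8, where strings carry no encoding and the linked snippet is the practical route). The names valid_utf8? and to_utf8_from are hypothetical, not part of the linked snippet, and the sample address and the guess that the source bytes are Latin-1 are assumptions too.

```ruby
# Hypothetical helpers for illustration; not taken from the linked snippet.
# Assumes Ruby 2.0+, where every String carries an encoding.

# True if the raw bytes of +str+ form well-formed UTF-8.
def valid_utf8?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Reinterpret the bytes of +str+ as +source_encoding+ and transcode to UTF-8.
def to_utf8_from(str, source_encoding)
  str.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
end

# Sample data (assumed): "é" as the single Latin-1 byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
raw = "caf\xE9@example.com".b

fixed =
  if valid_utf8?(raw)
    raw.dup.force_encoding(Encoding::UTF_8)     # bytes are fine, just tag them
  else
    to_utf8_from(raw, Encoding::ISO_8859_1)     # assumed source encoding
  end

puts fixed.valid_encoding?
```

If the bytes turn out to be in a different encoding (CP1252, UTF-16LE, and so on), pass that encoding instead of Encoding::ISO_8859_1; guessing wrong will produce valid UTF-8 that still displays the wrong characters.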