MySQL blob dump to tab delimited files
I am migrating a MySQL 5.1 database in Amazon's EC2, and I am having issues with tables that use the longblob datatype for image storage. After the migration, the data in the longblob column is a different size, apparently because the character encoding differs.
First of all, here is an example of before and after the migration:
Old:
x??]]??}?_ѕ??d??i|w?%?????q$??+?
New:
x��]]����_ѕ��d��i|w�%�����q$��+�
I checked the character set variables on both machines and they are identical, as is the output of SHOW CREATE TABLE. The clients both connect the same way (no SET NAMES and no explicit character set).
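For reference, the variable listings further down can be reproduced with a query along these lines on each machine:

SHOW VARIABLES LIKE 'character\_set%';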
Here is the mysqldump command I used (I tried it without --hex-blob as well):
mysqldump --hex-blob --default-character-set=utf8 --tab=. DB_NAME
Here is how I loaded the data:
mysql DB_NAME --default-character-set=utf8 -e "LOAD DATA INFILE 'EXAMPLE.txt' INTO TABLE EXAMPLE;"
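One way to quantify the corruption is to checksum the blob column on each machine; if the data survived the trip intact, the totals should match. This is just a sketch, with image_col standing in for the actual longblob column:

-- Run on both servers and compare the three totals.
-- SUM(CRC32(...)) is order-independent, so row ordering doesn't matter.
SELECT COUNT(*) AS num_rows,
       SUM(LENGTH(image_col)) AS total_bytes,
       SUM(CRC32(image_col)) AS crc_sum
FROM EXAMPLE;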
Here are the MySQL character set variables (identical):
Old:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
New:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
I'm not sure what else to try so that mysqldump produces identical blob data on both machines. Any tips would be greatly appreciated.
The issue seems to be a bug in MySQL (http://bugs.mysql.com/bug.php?id=27724). The workaround is to avoid mysqldump and instead write your own SELECT ... INTO OUTFILE statement for the tables that contain blob data, hex-encoding the blob columns yourself. Here is an example:
-- If @nullval is unset it is NULL and COALESCE is a pass-through; NULLs are
-- then written out as \N, which LOAD DATA INFILE reads back as NULL by default.
SELECT
  COALESCE(column1, @nullval),
  COALESCE(column2, @nullval),
  COALESCE(HEX(column3), @nullval),
  COALESCE(column4, @nullval),
  COALESCE(column5, @nullval)
FROM `table`
INTO OUTFILE '/mnt/dump/table.txt'
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
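Note that INTO OUTFILE is written by the MySQL server process itself, not the client: the target directory must be writable by mysqld, the file must not already exist, and the account running the statement needs the FILE privilege.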
To load the data:
SET NAMES utf8;
LOAD DATA INFILE '/mnt/dump/table.txt'
INTO TABLE `table`
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
(column1, column2, @column3, column4, column5)
SET column3 = UNHEX(@column3);
This loads the blob data correctly.
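As a final sanity check, assuming the table definitions are identical on both machines, comparing row-by-row checksums of the source and destination tables should show matching values:

CHECKSUM TABLE `table` EXTENDED;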