PHP: json_encode vs serialize for storing in a MySQL database?

I'm storing some "unstructured" data (a keyed array) in one field of my table, and I'm currently using serialize() / unserialize() to "convert" back and forth from array to string.

Every now and then, however, I get errors when unserializing the data. I believe these errors happen because of Unicode data in the strings inside the array I'm serializing, although there are some records with Unicode data that work just fine. (DB field is UTF-8)

I'm wondering whether using json_encode instead of serialize will make a difference / make this more resilient. This is not trivial for me to test, since in my dev environment everything works well, but in production, every now and then (about 1% of records) I get an error.

Btw, I know I'm weaseling out of finding an actual explanation for the problem and just blindly trying something. I'm kind of hoping I can get rid of this without spending too much time on it.

Do you think using json_encode instead of serialize will make this more resilient to "serialization errors"? The data format does look more "forgiving" to me...

UPDATE: The actual error I'm getting is:

 Notice: unserialize(): Error at offset 401 of 569 bytes in C:\blah.php on line 20

Thanks! Daniel


JSON has one main advantage:

  • compatibility with languages other than PHP.

PHP's serialize has one main advantage:

  • it's specifically designed to store PHP-based data; most notably, it can store serialized objects (instances of classes), which are re-instantiated as the right class type when the string is unserialized.

(Yes, those advantages are the exact opposite of each other.)


In your case, as you are storing data that's not really structured, both formats should work pretty well.

And the encoding problem you have should not be related to serialize itself: as long as everything (DB, connection to the DB, PHP files, ...) is in UTF-8, serialization should work too.
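One way to see why a mismatch anywhere in that chain can break unserialize(): the serialized format records each string's length in bytes, so if a multi-byte character is transcoded after serialization, the recorded length no longer matches the payload. A minimal sketch follows; the str_replace() call only simulates a UTF-8 to Latin-1 conversion happening somewhere between PHP and the database, which is an assumption for illustration and not a diagnosis of the production data.

```php
<?php
// "Schrödinger" written as explicit UTF-8 bytes: ö is the two bytes C3 B6.
$name = "Schr\xC3\xB6dinger";

$serialized = serialize($name);
// The length prefix counts bytes, not characters: 12 bytes for 11 characters.
echo $serialized, "\n";

// Simulate a lossy transcoding step (UTF-8 ö -> the single Latin-1 byte F6).
$damaged = str_replace("\xC3\xB6", "\xF6", $serialized);

// The payload is now 11 bytes but the prefix still says 12, so parsing fails.
// The @ only hides the notice/warning that unserialize() raises here.
$result = @unserialize($damaged);
var_dump($result); // bool(false)
```

The same kind of failure would occur if the column or connection silently truncated or replaced bytes, which could explain why only some records with Unicode data are affected.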


Unless you absolutely need to preserve PHP-specific types, I think json_encode() is the way to go for storing structured data in a single field in MySQL. Here's why:

https://dev.mysql.com/doc/refman/5.7/en/json.html

"As of MySQL 5.7.8, MySQL supports a native JSON data type defined by RFC 7159 that enables efficient access to data in JSON (JavaScript Object Notation) documents."

If you are using a version of MySQL that supports the new JSON data type you can benefit from that feature.

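Another important point of consideration is the ability to perform changes on those JSON strings. Suppose you have a URL stored in encoded strings all over your database. WordPress users who've ever tried to migrate an existing database to a new domain name may sympathize here. If the data is serialized, a plain text replacement is going to break things, because each serialized string carries a byte-length prefix that no longer matches once the text changes length. If it's JSON, you can simply run a query using REPLACE() and everything will be fine, provided the slashes were stored unescaped (by default json_encode() writes "/" as "\/"). Example: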

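$arr = ['url' => 'http://example.com'];
$ser = serialize($arr);
// JSON_UNESCAPED_SLASHES keeps "/" literal; by default json_encode() writes "\/",
// which the 'http://' replacement below would not match.
$jsn = json_encode($arr, JSON_UNESCAPED_SLASHES);

// Naive text replacement on both encoded strings.
$ser = str_replace('http://', 'https://', $ser);
$jsn = str_replace('http://', 'https://', $jsn);

print_r(unserialize($ser));
// PHP Notice:  unserialize(): Error at offset 39 of 43 bytes in /root/sandbox/encoding.php on line 10

print_r(json_decode($jsn, true));
// Array ( [url] => https://example.com )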
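If the data is already stored with serialize(), the replacement can still be done safely by decoding first and re-encoding afterwards, so the length prefixes are rebuilt. A minimal sketch, assuming the stored value is a (possibly nested) array of scalars; replace_in_serialized() is a hypothetical helper name, not a built-in function:

```php
<?php
// Hypothetical helper: apply a string replacement to every string value
// inside a serialized array, then re-serialize so byte lengths are rebuilt.
function replace_in_serialized(string $serialized, string $search, string $replace): string
{
    // allowed_classes => false refuses to instantiate objects from stored data.
    $data = unserialize($serialized, ['allowed_classes' => false]);

    array_walk_recursive($data, function (&$value) use ($search, $replace) {
        if (is_string($value)) {
            $value = str_replace($search, $replace, $value);
        }
    });

    return serialize($data);
}

$ser = serialize(['url' => 'http://example.com']);
$fixed = replace_in_serialized($ser, 'http://', 'https://');

print_r(unserialize($fixed));
```

This has to run in PHP rather than as a single SQL REPLACE() query, which is the extra cost of the serialized format the answer is pointing at.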


By default, json_encode() escapes non-ASCII characters as \uXXXX sequences (e.g., “Schrödinger” becomes “Schr\u00f6dinger”), whereas serialize() stores the bytes unchanged.

Source: https://www.toptal.com/php/10-most-common-mistakes-php-programmers-make#common-mistake-6--ignoring-unicodeutf-8-issues


To leave UTF-8 characters untouched, you can use the option JSON_UNESCAPED_UNICODE as of PHP 5.4.

Source: https://stackoverflow.com/a/804089/1438029
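To make the difference concrete, here is a small sketch comparing the default output with the JSON_UNESCAPED_UNICODE output for the same string. The string is written as explicit UTF-8 bytes so the example does not depend on the encoding of the source file.

```php
<?php
$name = "Schr\xC3\xB6dinger"; // "Schrödinger" as UTF-8 bytes

$escaped   = json_encode($name);
$unescaped = json_encode($name, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";   // "Schr\u00f6dinger"
echo $unescaped, "\n"; // "Schrödinger"

// Both forms decode back to the same original string.
var_dump(json_decode($escaped) === $name && json_decode($unescaped) === $name);
```

The escaped form is pure ASCII, so it survives a database column or connection that is not UTF-8 clean; the unescaped form is shorter and easier to read or search in the database.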


If the problem is (and I believe it is) in UTF-8 encoding, there is no difference between json_encode and serialize. Both will leave the character encoding unchanged.

You should make sure your database/connection is properly set up to handle all UTF-8 characters, or encode the whole record into a supported encoding before inserting it into the DB.
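One nuance that may help when tracking down the bad records: json_encode() requires valid UTF-8 input and fails at write time when it does not get it, while serialize() accepts any bytes. A minimal sketch, using a lone Latin-1 byte as a stand-in for whatever invalid data might reach the application (an assumption for illustration):

```php
<?php
// A lone byte F6 ("ö" in ISO-8859-1) is not valid UTF-8.
$latin1 = "Schr\xF6dinger";

// serialize() does not inspect the bytes, so the value round-trips unchanged.
$roundTrip = unserialize(serialize($latin1));
var_dump($roundTrip === $latin1);

// json_encode() rejects the invalid sequence and returns false.
$json = json_encode($latin1);
var_dump($json); // bool(false)
echo json_last_error_msg(), "\n";
```

Checking the return value of json_encode() before inserting would therefore surface encoding problems when the record is written rather than when it is read back.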

Also please specify what "I get an error" means.


Found this in the PHP docs...

function mb_unserialize($serial_str) {
    // Recompute each s:N byte length (the /e modifier used originally was removed in PHP 7).
    $fix = function ($m) { return 's:' . strlen($m[2]) . ':"' . $m[2] . '";'; };
    $out = preg_replace_callback('!s:(\d+):"(.*?)";!s', $fix, $serial_str);
    return unserialize($out);
}

I don't quite understand it, but it worked to unserialize the data that I couldn't unserialize before. I've moved to JSON now, and I'll report in a couple of weeks whether this solved the problem of some records randomly getting "corrupted".


Since I'm going through this myself, I'll give my opinion: both serialize and json_encode are good for storing data in a DB. For those looking for performance, I tested both, and json_encode came out a few microseconds faster than serialize. I used this script to measure the time difference:

// Build an array of 9999 integers.
$bounced = array();
for ($i = count($bounced); $i < 9999; ++$i) {
    $bounced[$i] = $i;
}

// Note: the var_dump() output is included in each timed region.
$timeStart = microtime(true);
var_dump(serialize($bounced));
unserialize(serialize($bounced));
print timer_diff($timeStart) . " sec.\n";

$timeStart = microtime(true);
var_dump(json_encode($bounced));
json_decode(json_encode($bounced));
print timer_diff($timeStart) . " sec.\n";

// Seconds elapsed since $timeStart, formatted to three decimals.
function timer_diff($timeStart)
{
    return number_format(microtime(true) - $timeStart, 3);
}
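
A variant of the same measurement that keeps output out of the timed region and repeats each round trip, so that printing and one-off noise weigh less on the result. This is only a sketch: time_rounds() is a hypothetical helper, the round count of 100 is arbitrary, and results will differ between PHP versions and data shapes.

```php
<?php
// Hypothetical helper: call $fn $rounds times and return the seconds elapsed.
function time_rounds(callable $fn, int $rounds): float
{
    $start = hrtime(true); // monotonic clock in nanoseconds (PHP 7.3+)
    for ($i = 0; $i < $rounds; ++$i) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$data = range(0, 9998); // the same 9999 integers as in the script above

$serializeTime = time_rounds(function () use ($data) {
    unserialize(serialize($data));
}, 100);

$jsonTime = time_rounds(function () use ($data) {
    json_decode(json_encode($data), true);
}, 100);

printf("serialize: %.4f sec, json: %.4f sec\n", $serializeTime, $jsonTime);
```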


As a design decision, I'd opt for storing JSON because it can only represent a data structure, whereas serialization is bound to a PHP data object signature.

The advantages I see are:

  • you are forced to separate the data storage from any logic layer on top.
  • you are independent from changes to the data object class (say, for example, that you want to add a field).
