Python's cPickle deserialization from PHP?
I have to deserialize a dictionary in PHP that was serialized using cPickle in Python.
In this specific case I could probably just use a regular expression to extract the information I need, but is there a better way? Are there any PHP extensions that would let me deserialize the whole dictionary more natively?
Apparently it is serialized in Python like this:
import cPickle as pickle
data = { 'user_id' : 5 }
pickled = pickle.dumps(data)
print pickled
The contents of such a serialization cannot easily be pasted here, because it contains binary data.
If you want to share data objects between programs written in different languages, it might be easier to serialize/deserialize using something like JSON instead. Most major programming languages have a JSON library.
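As a minimal sketch of that suggestion, assuming the Python side can be changed to emit JSON instead of a pickle (Python 3 syntax here; the variable names are only illustrative), the same dictionary could be written out like this:

```python
import json

# Same dictionary as in the question.
data = {'user_id': 5}

# json.dumps returns a plain text string that any JSON library can parse.
encoded = json.dumps(data)
print(encoded)

# Round trip on the Python side to confirm nothing is lost.
decoded = json.loads(encoded)
```

On the PHP side, json_decode($string, true) should then turn that string into an associative array.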
Can you do a system call? You could use a python script like this to convert the pickle data into json:
# pickle2json.py
import sys, optparse, cPickle
try:
    import json
except ImportError:
    import simplejson as json

# Set up the arguments this script can accept from the command line
parser = optparse.OptionParser()
parser.add_option('-p', '--pickled_data_path', dest="pickled_data_path", type="string", help="Path to the file containing pickled data.")
parser.add_option('-j', '--json_data_path', dest="json_data_path", type="string", help="Path to where the json data should be saved.")
opts, args = parser.parse_args()

# Load the pickled data from either a file or the standard input stream
if opts.pickled_data_path:
    # Open in binary mode so protocol 1 and 2 pickles are read unchanged
    unpickled_data = cPickle.loads(open(opts.pickled_data_path, 'rb').read())
else:
    unpickled_data = cPickle.loads(sys.stdin.read())

# Write the JSON version of the data either to another file or to the standard output
if opts.json_data_path:
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
else:
    print json.dumps(unpickled_data)
This way, if you're getting the data from a file, you could do something like this:
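<?php
// exec() fills its second argument (passed by reference) with the output lines
$json_data = array();
exec("python pickle2json.py -p pickled_data.txt", $json_data);
$data = json_decode(implode("\n", $json_data), true);
?>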
Or, if you want to save it out to a file, this:
<?php
system("python pickle2json.py -p pickled_data.txt -j p_to_j.json");
?>
All the code above probably isn't perfect (I'm not a PHP developer), but would something like this work for you?
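For reference, the core of the conversion can be reduced to one function. This is only a sketch in Python 3 syntax (where `pickle` replaces `cPickle`); `pickle_to_json` is a hypothetical helper name, and it assumes the pickle holds only JSON-compatible types (dictionaries with string keys, lists, strings, numbers). It should only be fed pickles from a trusted source, since unpickling can execute arbitrary code.

```python
import json
import pickle


def pickle_to_json(pickled_bytes):
    """Unpickle a bytes object and return its JSON text (hypothetical helper)."""
    # encoding='latin-1' lets Python 3 read 8-bit strings written by Python 2.
    obj = pickle.loads(pickled_bytes, encoding='latin-1')
    return json.dumps(obj)


# Build a sample pickle of the dictionary from the question.
sample = pickle.dumps({'user_id': 5}, protocol=0)
result = pickle_to_json(sample)
print(result)
```

Keeping the conversion in a function like this makes it easy to call from either the file branch or the standard-input branch of a command-line wrapper.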
I know this is ancient, but I've just needed to do this for a Django 1.3 app (circa 2012) and found this:
https://github.com/terryf/Phpickle
So just in case, one day, someone else needs the same solution.
If the pickle is being created by the code that you showed, then it won't contain binary data, unless you are calling newlines "binary data". See the Python docs. The following code was run with Python 2.6.
>>> import cPickle
>>> data = {'user_id': 5}
>>> for protocol in (0, 1, 2): # protocol 0 is the default
...     print protocol, repr(cPickle.dumps(data, protocol))
...
0 "(dp1\nS'user_id'\np2\nI5\ns."
1 '}q\x01U\x07user_idq\x02K\x05s.'
2 '\x80\x02}q\x01U\x07user_idq\x02K\x05s.'
>>>
Which of the above looks most like what you are seeing? Can you post the pickled file contents as displayed by a hex editor/dumper or whatever is the PHP equivalent of Python's repr()? How many items in a typical dictionary? What data types other than "integer" and "string of 8-bit bytes" (what encoding?)?
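To tell which protocol a given pickle uses, one rough check is whether it consists only of printable ASCII and newlines. The sketch below (Python 3 syntax; `is_plain_text` is a hypothetical helper) applies that check to the three byte strings printed above and confirms that each one loads back to the same dictionary:

```python
import pickle

# The three byte strings printed above, copied as Python 3 bytes literals.
pickles = {
    0: b"(dp1\nS'user_id'\np2\nI5\ns.",
    1: b'}q\x01U\x07user_idq\x02K\x05s.',
    2: b'\x80\x02}q\x01U\x07user_idq\x02K\x05s.',
}


def is_plain_text(blob):
    """True if every byte is printable ASCII or a newline."""
    return all(32 <= byte < 127 or byte == 10 for byte in blob)


for protocol, blob in sorted(pickles.items()):
    # Show the protocol, whether it is plain text, and the decoded value.
    print(protocol, is_plain_text(blob), pickle.loads(blob))
```

Only the protocol 0 output passes the plain-text check, which is why it is the only one of the three that could realistically be picked apart with a regular expression.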