Parsing JSON with Python
I'm getting an error while parsing a JSON response in Python. For example:
{
"oneliners": [
"she\'s the one",
"who opened the gates"
]
}
The JSON decoder chokes on the invalid escape before the single quote. Do people typically apply a regex to remove the backslash before decoding a response that might contain an invalid escape?
Pyparsing ships with a JSON parsing example (jsonParser.py, also available online):
>>> from jsonParser import jsonObject  # assumes the example file is on the import path
>>> text = r"""{
... "oneliners": [
... "she\'s the one",
... "who opened the gates"
... ]
... } """
>>> text
'{ \n "oneliners": [ \n "she\\\'s the one", \n "who opened the gates" \n ] \n} '
>>> obj = jsonObject.parseString(text)
>>> obj.asList()
[['oneliners', ["she\\'s the one", 'who opened the gates']]]
>>> obj.asDict()
{'oneliners': (["she\\'s the one", 'who opened the gates'], {})}
>>> obj.oneliners
(["she\\'s the one", 'who opened the gates'], {})
>>> obj.oneliners.asList()
["she\\'s the one", 'who opened the gates']
Don't be put off by the seeming inclusion of a dict (the '{}') in obj.oneliners; that is just the repr output for a pyparsing ParseResults object. You can treat obj.oneliners like an ordinary list or, if you like, extract its contents as a list using asList, as shown. Note that the backslash is kept literally in the parsed string.
If you have a \' character sequence in your JSON string representation, and you KNOW it should be ', it means the string was improperly escaped earlier, and you should fix the problem there.
If you can't, do the replacement before you pass the string to the JSON parser. simplejson will fail to parse it; cjson or anyjson would not fail, but will handle it literally, so you will have the backslash-apostrophe sequence in the resulting data.
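A minimal sketch of the "replace before parsing" approach, assuming the only invalid escape in the input is \' and that it should become a plain apostrophe. The helper name strip_escaped_apostrophes is hypothetical, not part of any library. The regex deliberately skips an apostrophe that follows an escaped backslash (\\'), since that sequence is already valid JSON:

```python
import json
import re

# Matches a backslash-apostrophe whose backslash is not itself escaped:
# any run of backslash pairs (kept via group 1) followed by \'
_BAD_APOSTROPHE = re.compile(r"(?<!\\)((?:\\\\)*)\\'")


def strip_escaped_apostrophes(text):
    """Hypothetical helper: replace the invalid JSON escape \\' with '."""
    return _BAD_APOSTROPHE.sub(r"\1'", text)


# Raw string, so the backslash really is part of the JSON text.
raw = r'{"oneliners": ["she\'s the one", "who opened the gates"]}'

try:
    json.loads(raw)
    strict_ok = True
except ValueError:  # json.JSONDecodeError is a ValueError subclass
    strict_ok = False

data = json.loads(strip_escaped_apostrophes(raw))
print(strict_ok)
print(data["oneliners"][0])
```

Fixing the escaping at the source remains preferable; a cleanup like this is only a fallback when the producer of the JSON cannot be changed.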
import json
s = """{
"oneliners": [
"she\'s the one",
"who opened the gates"
]
}"""
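print("%r" % json.loads(s))
This appears to work just fine, in Python 2.6 and upwards anyway. Note, however, that s is not a raw string, so Python itself turns \' into ' and the decoder never sees a backslash; with a raw string literal (r"""...""") json.loads rejects the invalid escape.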
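To check whether the decoder ever receives the backslash, it helps to compare a plain string literal with a raw one. A small sketch (the variable names are illustrative):

```python
import json

# Plain literal: Python consumes the backslash, leaving a bare apostrophe.
plain = """{"oneliners": ["she\'s the one"]}"""
# Raw literal: the backslash survives and reaches the JSON decoder.
raw = r"""{"oneliners": ["she\'s the one"]}"""

print("\\" in plain)
print("\\" in raw)

print(json.loads(plain))

decode_error = None
try:
    json.loads(raw)
except ValueError as exc:  # raised for the invalid \' escape
    decode_error = exc
    print("decode failed: %s" % exc)
```

A response read from a network or file behaves like the raw case, because no Python string-literal processing is applied to it.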