Non greedy parsing with pyparsing

I'm trying to parse a line with pyparsing. This line is composed of a number of (key, values). What I'd like to get is a list of (key, values). A simple example:

ids = 12 fields = name

should result in something like: [('ids', '12'), ('fields', 'name')]

A more complex example:

ids = 12, 13, 14 fields = name, title

should result in something like: [('ids', '12, 13, 14'), ('fields', 'name, title')]

PS: the tuple inside the resulting list is just an example. It could be a dict or another list or whatever, it's not that important.

But whatever I've tried up to now I get results like: [('ids', '12 fields')]

Pyparsing is eating the next key, considering it's also part of the value.

Here is some sample code:

import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Literal('=')
key_equal = key + equal
val = ~key_equal + P.Word(P.alphanums+', ')

gr = P.Group(key_equal+val)
print(gr.parseString("ids = 12 fields = name"))

Can someone help me? Thanks.


The first problem lies in this line:

val = ~key_equal + P.Word(P.alphanums+', ')

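It reads as if the value were an alphanumeric sequence followed by the literal ', ', but Word takes a set of allowed characters, so it actually matches any run of alphanumeric characters, ',' and ' '. The negative lookahead ~key_equal is only checked once, at the start of the value, so nothing stops Word from running across the space and into the next key.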
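
To see this in isolation, here is a minimal sketch (the name loose is just for illustration) that runs the original Word expression on its own:

```python
import pyparsing as P

# Word takes a *set* of allowed characters, not a pattern matched in order.
loose = P.Word(P.alphanums + ', ')

# Space and comma are in the set, so the match keeps going until it hits
# the first character outside the set, which is '=' in this input.
matched = loose.parseString("12 fields = name")[0]
print(repr(matched))
```

The printed value includes the word fields, which is exactly the "eating the next key" behaviour described in the question.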

What you'd want instead is:

val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)

The second problem is that you only parse one key-value pair:

gr = P.Group(key_equal+val)

Instead, you should parse as many as possible:

gr = P.OneOrMore(P.Group(key_equal+val))
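
Where the Group goes matters for the shape of the result. A rough comparison, assuming the delimitedList-based value from above (pair, flat and nested are illustrative names):

```python
import pyparsing as P

key = P.oneOf("ids fields")
value = P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
pair = key + P.Literal('=') + value

text = "ids = 12 fields = name"

# One Group around the whole repetition: every token lands in a single list.
flat = P.Group(P.OneOrMore(pair)).parseString(text).asList()
print(flat)    # [['ids', '=', '12', 'fields', '=', 'name']]

# One Group per pair: each key/value pair gets its own sub-list.
nested = P.OneOrMore(P.Group(pair)).parseString(text).asList()
print(nested)  # [['ids', '=', '12'], ['fields', '=', 'name']]
```

If you want one entry per key, the second form is the one to use.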

So the correct solution is:

>>> import pyparsing as P
>>> key = P.oneOf("ids fields")
>>> equal = P.Literal('=')
>>> key_equal = key + equal
>>> val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
>>> gr = P.OneOrMore(P.Group(key_equal+val))
>>> print(gr.parseString("ids = 12, 13, 14 fields = name, title"))
[['ids', '=', '12, 13, 14'], ['fields', '=', 'name, title']]
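
If you would rather get the (key, value) tuples from the question, one possible variation is to wrap '=' in Suppress so it is matched but dropped from the results, then convert each group. This is only a sketch; pairs and as_dict are names I made up here:

```python
import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Suppress('=')  # match '=' but keep it out of the results
key_equal = key + equal
val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
gr = P.OneOrMore(P.Group(key_equal + val))

tokens = gr.parseString("ids = 12, 13, 14 fields = name, title")

# Each group is now [key, value], so it converts directly to a tuple.
pairs = [tuple(group) for group in tokens.asList()]
as_dict = dict(pairs)
print(pairs)  # [('ids', '12, 13, 14'), ('fields', 'name, title')]
```

Note that because the delimiter is the exact string ", ", a value written without the space (such as 12,13) would stop after the first item.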