
scanString end location: why is it end_index+1?

python/pyparsing

When I use the scanString method, it gives the start and end location of each matched token in the text.

e.g.

from pyparsing import Word, alphas

line = "cat bat"
pat = Word(alphas)
# each result is a (tokens, start, end) tuple
for i in pat.scanString(line):
    print(i)

I get the following:

((['cat'], {}), 0, 3)
((['bat'], {}), 4, 7)

But the end location of "cat" should be 2, right? Why does it report the next location as the end location?


This is consistent with Python's [begin:end] slicing conventions, where the "end" is the index of the next character. By putting the end as the next location, it is very straightforward to extract the matching substring using the returned values:

for t, start, end in pat.scanString(line):
    # end is one past the last matched character, so the slice is exact
    print(line[start:end])

You can see how this is used if you look in the pyparsing source code for the implementation of transformString.
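To make the benefit concrete, here is a minimal sketch of that kind of rewriting. It is not pyparsing's actual transformString code; `replace_spans` is a hypothetical helper, and the spans are hard-coded to the (start, end) values from the question rather than taken from a live scanString call. The point is only that with an exclusive end, the unmatched text between two matches is simply `text[previous_end:next_start]`, with no +1 or -1 adjustments.

```python
def replace_spans(text, spans):
    """Rebuild text, replacing each half-open (start, end) span.

    spans is a list of (start, end, replacement) tuples, sorted and
    non-overlapping, where end is one past the last matched character
    (the same convention scanString reports).
    """
    pieces = []
    last = 0
    for start, end, replacement in spans:
        pieces.append(text[last:start])  # unmatched text before this match
        pieces.append(replacement)
        last = end  # the next unmatched chunk starts exactly at end
    pieces.append(text[last:])  # trailing unmatched text
    return "".join(pieces)


line = "cat bat"
# (start, end) values copied from the scanString output in the question
spans = [(0, 3, "CAT"), (4, 7, "BAT")]
result = replace_spans(line, spans)
print(result)
```

If the end were inclusive (2 and 6 here), every slice above would need an extra +1, and an empty match could not be represented as start == end.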
