Python - extracting a list of sub strings

2023-01-31 11:23

How to extract a list of substrings based on some patterns in Python?

For example:

str = 'this {{is}} a sample {{text}}'

Expected result: a Python list that contains 'is' and 'text'.

>>> import re
>>> re.findall("{{(.*?)}}", "this {{is}} a sample {{text}}")
['is', 'text']
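
The `?` after `.*` matters here. As a small sketch (the variable names are just for illustration), this compares the lazy pattern from above with a greedy variant on the same input:

```python
import re

text = "this {{is}} a sample {{text}}"

# Lazy quantifier: stops at the first "}}" that lets the match succeed.
lazy = re.findall(r"{{(.*?)}}", text)

# Greedy quantifier: runs on to the last "}}" in the string.
greedy = re.findall(r"{{(.*)}}", text)

print(lazy)    # ['is', 'text']
print(greedy)  # ['is}} a sample {{text']
```

With the greedy form, the two tags collapse into a single match, which is usually not what you want.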

Assuming "some patterns" means "single words between double {}'s":

import re

re.findall(r'{{(\w*)}}', string)

Edit: Andrew Clark's answer implements "any sequence of characters at all between double {}'s"
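
To make the difference concrete, here is a rough comparison on a made-up input (the sample string is an assumption, not from the question) that contains a multi-word tag and an empty tag:

```python
import re

# Hypothetical input: a single word, two words, and an empty tag.
sample = "{{is}} {{two words}} {{}}"

# \w* only accepts word characters, so the tag with a space is skipped.
words_only = re.findall(r"{{(\w*)}}", sample)

# .*? accepts any characters (except newlines) up to the next "}}".
anything = re.findall(r"{{(.*?)}}", sample)

print(words_only)  # ['is', '']
print(anything)    # ['is', 'two words', '']
```

Both patterns accept the empty tag because `*` allows zero repetitions; use `+` instead if empty tags should be ignored.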

You can use the following:

import re
a = 'this {{is}} a sample {{text}}'
res = re.findall("{{([^{}]*)}}", a)
print("a python list which contains %s and %s" % (res[0], res[1]))

Cheers
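
One place where excluding braces with `[^{}]*` behaves differently from `.*?` is malformed input. A minimal sketch, assuming an input with a stray unclosed `{{` (the string is invented for illustration):

```python
import re

# Hypothetical input: an unclosed "{{" followed by a well-formed tag.
text = "{{a {{b}}"

# .*? starts at the first "{{" and swallows the second one.
lazy = re.findall(r"{{(.*?)}}", text)

# [^{}]* cannot cross a brace, so only the innermost tag matches.
no_braces = re.findall(r"{{([^{}]*)}}", text)

print(lazy)       # ['a {{b']
print(no_braces)  # ['b']
```

Which result is "right" depends on how you want to treat broken tags; the brace-excluding class is the stricter of the two.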

A regex-based solution is fine for your example, although I would recommend something more robust for more complicated input.

import re

def match_substrings(s):
    return re.findall(r"{{([^}]*)}}", s)

The regex from inside-out:

[^}] matches anything that's not a '}'
([^}]*) matches any number of non-} characters and groups them
{{([^}]*)}} puts the above inside double-braces

Without the parentheses above, re.findall would return the entire match (i.e. ['{{is}}', '{{text}}']). However, when the regex contains a group, findall returns the contents of that group instead.
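
A quick sketch of that behaviour, using the example string from the question (the third pattern, with two groups, is added here only for illustration):

```python
import re

text = "this {{is}} a sample {{text}}"

# One group: findall returns just the group's contents.
with_group = re.findall(r"{{([^}]*)}}", text)

# No group: findall returns each whole match.
without_group = re.findall(r"{{[^}]*}}", text)

# Two groups: findall returns one tuple per match.
two_groups = re.findall(r"({{)([^}]*)}}", text)

print(with_group)     # ['is', 'text']
print(without_group)  # ['{{is}}', '{{text}}']
print(two_groups)     # [('{{', 'is'), ('{{', 'text')]
```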

You could use a regular expression to match anything that occurs between {{ and }}. Will that work for you?

Generally speaking, for tagging certain strings in a large body of text, a suffix tree will be useful.
