
Regular expression to return text between parentheses

u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'

All I need is the contents inside the parentheses.


If your problem is really just this simple, you don't need regex:

s[s.find("(")+1:s.find(")")]
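One caveat with the slicing approach: str.find returns -1 when the character is absent, so the slice quietly returns the wrong substring instead of failing. A minimal sketch of a guarded version follows; the helper name between_parens is hypothetical and not part of the original answer.

```python
def between_parens(s):
    """Return the text between the first "(" and the next ")", or None."""
    start = s.find("(")
    if start == -1:
        return None  # no opening parenthesis at all
    end = s.find(")", start + 1)
    if end == -1:
        return None  # no closing parenthesis after the opening one
    return s[start + 1:end]


print(between_parens("abcde(date='2',time='x.png')"))  # date='2',time='x.png'
print(between_parens("no parentheses here"))           # None
```

Returning None rather than an empty string keeps "no parentheses" distinguishable from an empty pair such as "()".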


Use re.search(r'\((.*?)\)',s).group(1):

>>> import re
>>> s = u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'
>>> re.search(r'\((.*?)\)',s).group(1)
u"date='2/xc2/xb2',time='/case/test.png'"
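Note that re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError on input without parentheses. A small sketch of a guarded call; the names PAREN_RE and first_group are assumptions made for illustration.

```python
import re

PAREN_RE = re.compile(r'\((.*?)\)')  # lazy: stops at the first ")"


def first_group(s):
    """Return the text inside the first (...) pair, or None if there is none."""
    match = PAREN_RE.search(s)
    return match.group(1) if match else None


print(first_group("abcde(date='2/xc2/xb2',time='/case/test.png')"))
print(first_group("nothing to see"))  # None
```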


If you want to find all occurrences (in this example, s also contains a second group, (eee)):

>>> re.findall(r'\(.*?\)',s)
[u"(date='2/xc2/xb2',time='/case/test.png')", u'(eee)']

>>> re.findall(r'\((.*?)\)',s)
[u"date='2/xc2/xb2',time='/case/test.png'", u'eee']


Building on tkerwin's answer, if you happen to have nested parentheses like in

st = "sum((a+b)/(c+d))"

his answer will not work if you need to take everything between the first opening parenthesis and the last closing parenthesis to get (a+b)/(c+d), because find searches from the left of the string, and would stop at the first closing parenthesis.

To fix that, you need to use rfind for the second part of the operation, so it would become

st[st.find("(")+1:st.rfind(")")]
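The difference between the two slices can be checked directly on the example string; the variable name first_open is introduced here only for readability.

```python
st = "sum((a+b)/(c+d))"

first_open = st.find("(")

# find() stops at the first ")", cutting the expression short.
print(st[first_open + 1:st.find(")")])   # (a+b

# rfind() searches from the right, so the slice runs to the last ")".
print(st[first_open + 1:st.rfind(")")])  # (a+b)/(c+d)
```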


import re

fancy = u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'

print(re.compile(r"\((.*)\)").search(fancy).group(1))
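This pattern uses a greedy .*, whereas the earlier answers use the lazy .*?. The two only differ when the string has more than one closing parenthesis. A quick comparison, using a made-up input string:

```python
import re

text = "f(a) + g(b)"  # hypothetical input with two groups

# Greedy: runs from the first "(" to the LAST ")".
greedy = re.search(r'\((.*)\)', text).group(1)

# Lazy: runs from the first "(" to the FIRST ")".
lazy = re.search(r'\((.*?)\)', text).group(1)

print(greedy)  # a) + g(b
print(lazy)    # a
```

The greedy form behaves like the find/rfind slice, and the lazy form behaves like the find/find slice.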


contents_re = re.match(r'[^\(]*\((?P<contents>[^\(]+)\)', data)
if contents_re:
    print(contents_re.groupdict()['contents'])


No need to use regex. Just use string slicing:

string="(tidtkdgkxkxlgxlhxl) ¥£%#_¥#_¥#_¥#"
print(string[string.find("(")+1:string.find(")")])


TheSoulkiller's answer is great. In my case, though, I needed to handle extra parentheses and extract only the word inside them. A very small change solves the problem: match one or more opening parentheses with \(+ instead of a single \(.

>>> s=u'abcde((((a+b))))-((a*b))'
>>> re.findall(r'\((.*?)\)',s)
['(((a+b', '(a*b']
>>> re.findall(r'\(+(.*?)\)',s)
['a+b', 'a*b']


Here are several ways to extract strings between parentheses in Pandas with the \(([^()]+)\) regex, which matches:

  • \( - a ( char
  • ([^()]+) - then captures into Group 1 any one or more chars other than ( and )
  • \) - a ) char.

Extracting the first occurrence using Series.str.extract:

import pandas as pd
df = pd.DataFrame({'Description':['some text (value 1) and (value 2)']})
df['Values'] = df['Description'].str.extract(r'\(([^()]+)\)')
# => df['Values']
#    0    value 1
#    Name: Values, dtype: object

Extracting (finding) all occurrences using Series.str.findall:

import pandas as pd
df = pd.DataFrame({'Description':['some text (value 1) and (value 2)']})
df['Values'] = df['Description'].str.findall(r'\(([^()]+)\)')
# => df['Values']
#    0    [value 1, value 2]
#    Name: Values, dtype: object

df['Values'] = df['Description'].str.findall(r'\(([^()]+)\)').str.join(', ')
# => df['Values']
#    0    value 1, value 2
#    Name: Values, dtype: object

Note that .str.join(', ') is used to create a comma-separated string out of the resulting list of strings. You may adjust this separator for your scenario.
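The \(([^()]+)\) pattern used in this answer can also be tried outside pandas with the standard re module. A minimal sketch; the constant name PATTERN and the second test string are assumptions added for illustration.

```python
import re

PATTERN = re.compile(r'\(([^()]+)\)')

text = 'some text (value 1) and (value 2)'
print(PATTERN.findall(text))  # ['value 1', 'value 2']

# Because [^()] excludes both parenthesis characters, only the innermost
# groups of a nested expression match, and an empty "()" is skipped.
print(PATTERN.findall('sum((a+b)/(c+d)) ()'))  # ['a+b', 'c+d']
```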


Test case:

s = "(rein<unint>(pBuf) +fsizeof(LOG_RECH))"

Result:

['pBuf', 'LOG_RECH', 'rein<unint>(pBuf) +fsizeof(LOG_RECH)']

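Implementation:

def getParenthesesList(s):
    """Return the contents of every (...) group, in the order each group closes."""
    res = []
    left = []  # stack of indices of "(" characters not yet matched
    for i, ch in enumerate(s):
        if ch == '(':
            left.append(i)
        elif ch == ')' and left:  # ignore a ")" that has no matching "("
            le = left.pop()
            res.append(s[le + 1:i])
    print(res)
    return res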
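If only the outermost groups are needed rather than every nesting level, a depth counter can take the place of the stack. A minimal sketch; the function name top_level_groups is hypothetical, and stray closing parentheses are ignored by assumption.

```python
def top_level_groups(s):
    """Return the contents of each outermost (...) group in s."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(s):
        if ch == '(':
            if depth == 0:
                start = i + 1  # first character after the outer "("
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(s[start:i])
    return groups


print(top_level_groups("sum((a+b)/(c+d)) - f(x)"))  # ['(a+b)/(c+d)', 'x']
```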


If I'm not missing something, a small fix to @tkerwin's answer: s[s.find("(")+1:s.rfind(")")]

The second find should be rfind, so that the search for the closing parenthesis starts from the end of the string.
