Regular expression to return text between parenthesis
u'abcde(date=\'2/xc2/xb2\',time=\'/case/te开发者_StackOverflow社区st.png\')'
All I need is the contents inside the parenthesis.
If your problem is really just this simple, you don't need regex:
s[s.find("(")+1:s.find(")")]
Use re.search(r'\((.*?)\)',s).group(1)
:
>>> import re
>>> s = u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'
>>> re.search(r'\((.*?)\)',s).group(1)
u"date='2/xc2/xb2',time='/case/test.png'"
If you want to find all occurences:
>>> re.findall('\(.*?\)',s)
[u"(date='2/xc2/xb2',time='/case/test.png')", u'(eee)']
>>> re.findall('\((.*?)\)',s)
[u"date='2/xc2/xb2',time='/case/test.png'", u'eee']
Building on tkerwin's answer, if you happen to have nested parentheses like in
st = "sum((a+b)/(c+d))"
his answer will not work if you need to take everything between the first opening parenthesis and the last closing parenthesis to get (a+b)/(c+d)
, because find searches from the left of the string, and would stop at the first closing parenthesis.
To fix that, you need to use rfind
for the second part of the operation, so it would become
st[st.find("(")+1:st.rfind(")")]
import re
fancy = u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'
print re.compile( "\((.*)\)" ).search( fancy ).group( 1 )
contents_re = re.match(r'[^\(]*\((?P<contents>[^\(]+)\)', data)
if contents_re:
print(contents_re.groupdict()['contents'])
No need to use regex .... Just use list slicing ...
string="(tidtkdgkxkxlgxlhxl) ¥£%#_¥#_¥#_¥#"
print(string[string.find("(")+1:string.find(")")])
TheSoulkiller's answer is great. just in my case, I needed to handle extra parentheses and only extract the word inside the parentheses. a very small change would solve the problem
>>> s=u'abcde((((a+b))))-((a*b))'
>>> re.findall('\((.*?)\)',s)
['(((a+b', '(a*b']
>>> re.findall('\(+(.*?)\)',s)
['a+b', 'a*b']
Here are several ways to extract strings between parentheses in Pandas with the \(([^()]+)\)
regex (see its online demo) that matches
\(
- a(
char([^()]+)
- then captures into Group 1 any one or more chars other than(
and)
\)
- a)
char.
Extracting the first occurrence using Series.str.extract
:
import pandas as pd
df = pd.DataFrame({'Description':['some text (value 1) and (value 2)']})
df['Values'] = df['Description'].str.extract(r'\(([^()]+)\)')
# => df['Values']
# 0 value 1
# Name: Values, dtype: object
Extracting (finding) all occurrences using Series.str.findall
:
import pandas as pd
df = pd.DataFrame({'Description':['some text (value 1) and (value 2)']})
df['Values'] = df['Description'].str.findall(r'\(([^()]+)\)')
# => df['Values']
# 0 [value 1, value 2]
# Name: Values, dtype: object
df['Values'] = df['Description'].str.findall(r'\(([^()]+)\)').str.join(', ')
# => df['Values']
# 0 value 1, value 2
# Name: Values, dtype: object
Note that .str.join(', ')
is used to create a comma-separated string out of the resulting list of strings. You may adjust this separator for your scenario.
testcase
s = "(rein<unint>(pBuf) +fsizeof(LOG_RECH))"
result
['pBuf', 'LOG_RECH', 'rein<unint>(pBuf) +fsizeof(LOG_RECH)']
implement
def getParenthesesList(s):
res = list()
left = list()
for i in range(len(s)):
if s[i] == '(':
left.append(i)
if s[i] == ')':
le = left.pop()
res.append(s[le + 1:i])
print(res)
return res
If im not missing something, a small fix to @tkerwin: s[s.find("(")+1:s.rfind(")")]
The 2nd find should be rfind so you start search from end of string
精彩评论