python re: r'\b \$ \d+ \b' won't match 'aug 12, 2010 abc $123'
I'm writing a script to collect $ values from a transaction-log-style file:
import re
import sys

for line in sys.stdin:
    match = re.match(r'\b \$ (\d+) \b', line)
    if match is not None:
        for value in match.groups():
            print(value)
Right now I'm just trying to print those values. It would match a line containing $12323, but not when there are other things in the line. From what I read it should work, but it looks like I could be missing something.
re.match:
If zero or more characters at the beginning of string match this regular expression, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
What you are looking for is either re.search or re.findall:
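#!/usr/bin/env python
import re
s = 'aug 12, 2010 abc $123'
# findall returns the captured group of every match in the string
print(re.findall(r'\$(\d+)', s))
# => ['123']
# group() is the whole match, including the $
print(re.search(r'\$(\d+)', s).group())
# => $123
# group(1) is only the captured digits
print(re.search(r'\$(\d+)', s).group(1))
# => 123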
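Applied to the loop from the question, this might look like the sketch below. The helper name dollar_values and the sample lines are made up for illustration; in the real script the lines would come from sys.stdin instead of a list.

```python
import re

# Hypothetical helper: collect every $ amount found anywhere in each line.
DOLLAR = re.compile(r'\$(\d+)')

def dollar_values(lines):
    values = []
    for line in lines:
        # findall scans the whole line, unlike re.match, which only
        # tries to match at the start of the string
        values.extend(DOLLAR.findall(line))
    return values

sample = ['aug 12, 2010 abc $123', 'no amount here', '$5 and $17']
print(dollar_values(sample))
# => ['123', '5', '17']
```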
By having a space between \$ and (\d+), the regex expects a space in your string between them. Is there such a space?
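A small sketch of the difference, using the sample line from the question title. Spaces in a pattern are literal by default; they are only ignored when the pattern is compiled with the re.VERBOSE flag, which may be what the spaced-out pattern was assuming.

```python
import re

s = 'aug 12, 2010 abc $123'

# Spaces in a normal pattern are literal: this needs "$ 123" in the input
print(re.search(r'\$ (\d+)', s))
# => None

# With re.VERBOSE, unescaped whitespace in the pattern is ignored
m = re.search(r'\$ (\d+)', s, re.VERBOSE)
print(m.group(1))
# => 123
```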
It is not clear to me what should be accepted, but from the statement
a line containing $12323 but not when there are other things in the line
I gather that
'aug 12, 2010 abc $123'
is not supposed to match, as it has other text before the amount.
If you want to match an amount at the end of the line, here is the customary anti-regexp answer (even though I am not against using regular expressions in easy cases):
loglines = ['aug 12, 2010 abc $123', " $1 ", "a $1 amount", "exactly $1 - no less"]
# match $amount at end of line without other text after
for line in loglines:
    if '$' in line:
        # rpartition splits on the last '$'; amount is everything after it
        _, _, amount = line.rpartition('$')
        try:
            amount = float(amount)
        except ValueError:
            # trailing text is not a number, so skip this line
            pass
        else:
            print("$%.2f" % amount)
Others have already pointed out some shortcomings of your regex (especially the mandatory spaces and re.match vs. re.search).
There is another thing, though: \b word anchors match between alphanumeric and non-alphanumeric characters. In other words, \b \$ will fail (even when doing a search instead of a match operation) unless the string has some alphanumeric characters before the space.
Example (admittedly contrived) to work with your regex:
>>> import re
>>> test = [" $1 ", "a $1 amount", "exactly $1 - no less"]
>>> for string in test:
...     print(re.search(r"\b \$\d+ \b", string))
...
None
<_sre.SRE_Match object at 0x0000000001DD4370>
None
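For comparison, here is a sketch of a pattern that drops the literal spaces and the leading \b and keeps only a \b after the digits. Whether the trailing \b is wanted is an assumption about the log format; the fourth test string is made up to show what it rejects.

```python
import re

test = [" $1 ", "a $1 amount", "exactly $1 - no less", "id$12abc"]

# \b after \d+ requires the digits to end at a word boundary,
# so "$12abc" is rejected while "$1 " and "$1 -" are accepted
pattern = re.compile(r"\$(\d+)\b")

results = [pattern.findall(s) for s in test]
print(results)
# => [['1'], ['1'], ['1'], []]
```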