Need help with re for matching and getting the matched value in Python
I need help with the re module.
file = 'file No.WR79050107006 from files'
What I am trying to do is validate that the file string contains WR followed by 11 digits.
result = re.match('^(\S| )*(?P<sr>(\d){11})(\S| )*', file)
It validates only the 11 digits, but not the WR before them. How can I do that?
Also, after matching with re, how can I get the matched value (WR79050107006)?
I could use string find
index = file.find('file No.')
and then take the next 13 characters.
Thanks
If you mean that WR must be followed by exactly 11 digits, not 12 or more,
mo = re.search(r'WR(\d{11})(\D|$)', thestring)
should do. If you actually mean "11 or more", there's no need for the (\D|$)
part (or equivalent negative lookahead, etc).
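For reference, the negative-lookahead variant mentioned above might look like the sketch below. The sample string comes from the question; the 12-digit string is made up purely to show the rejection case.

```python
import re

# Sample string taken from the question.
thestring = 'file No.WR79050107006 from files'

# (?!\d) asserts "no digit follows" without consuming a character,
# so it also succeeds when the digits end the string.
pattern = re.compile(r'WR(\d{11})(?!\d)')

mo = pattern.search(thestring)
print(mo.group(1))        # 79050107006

# Made-up input with 12 digits after WR: the lookahead rejects it.
too_long = pattern.search('file No.WR790501070061 from files')
print(too_long)           # None
```

Unlike (\D|$), the lookahead does not add a second group to the match, which keeps mo.groups() a little tidier.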
Edit: as the OP now says in a comment that there might be whitespace between the WR and the digits, this can change to
mo = re.search(r'WR\s*(\d{11})(\D|$)', thestring)
The difference, of course, is the \s*, which means "0 or more whitespace characters here".
mo is None if thestring has no such match; otherwise, mo.group(1) gives you the 11-digit substring of interest.
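Putting that together, here is a minimal sketch of the check-then-extract step. The helper name find_wr_number is made up for this example, and it assumes you want the value back as WR plus the digits even when whitespace separated them in the input.

```python
import re

# Same pattern as above: WR, optional whitespace, exactly 11 digits.
WR_PATTERN = re.compile(r'WR\s*(\d{11})(\D|$)')

def find_wr_number(text):
    """Return 'WR' plus the 11 digits, or None if text has no such match."""
    mo = WR_PATTERN.search(text)
    if mo is None:
        return None
    # group(1) holds only the digits, so the WR prefix is added back here.
    return 'WR' + mo.group(1)

print(find_wr_number('file No.WR79050107006 from files'))   # WR79050107006
print(find_wr_number('file No.WR 79050107006 from files'))  # WR79050107006
print(find_wr_number('file No.WR7905 from files'))          # None
```

Testing for None first matters because calling .group() on a failed search raises AttributeError.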
Try this:
result = re.search(r"No\.WR\d{11}", file)
>>> file = 'file No.WR79050107006 from files'
>>> for item in file.split():
...     if "No.WR" in item:
...         d = item.replace("No.WR", "")
...         if d.isdigit() and len(d) == 11:
...             print("ok")
...
ok
It's not clear from your comment on Alex's answer: is the record valid if there is a space between the WR and the 11 digits?
Hopefully one of these examples helps. Otherwise, add the other variations and the expected results to your question, and you should get answers that are straight to the point.
>>> import re
>>> re.findall(r'(WR\d{11})(?:\D|$)', 'file No.WR79050107006 from files')
['WR79050107006']
>>> re.findall(r'(WR)(\d{11})(?:\D|$)', 'file No.WR79050107006 from files')
[('WR', '79050107006')]
Whitespace between the WR and the 11 digits
>>> re.findall(r'(WR)(\d{11})(?:\D|$)', 'file No.WR 79050107006 from files')
[]
>>> re.findall(r'(WR)\s*(\d{11})(?:\D|$)', 'file No.WR 79050107006 from files')
[('WR', '79050107006')]
Anything between WR and the 11 digits
>>> re.findall(r'(WR).*(\d{11})(?:\D|$)', 'file No.WR!@#$%79050107006 from files')
[('WR', '79050107006')]
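Since the question already uses a named group, here is a rough sketch of the same idea with one. The group name wr and the function name all_wr_ids are just choices for this example, and it assumes no whitespace between the WR and the digits.

```python
import re

# Named group "wr" captures the whole ID (WR plus 11 digits);
# (?!\d) rejects runs of 12 or more digits.
WR_ID = re.compile(r'(?P<wr>WR\d{11})(?!\d)')

def all_wr_ids(text):
    """Return every WR + 11-digit ID found in text, in order."""
    return [mo.group('wr') for mo in WR_ID.finditer(text)]

print(all_wr_ids('file No.WR79050107006 from files'))
# ['WR79050107006']
print(all_wr_ids('WR11111111111 and WR22222222222'))
# ['WR11111111111', 'WR22222222222']
print(all_wr_ids('no id here'))
# []
```

An empty list means the string is not valid; otherwise the first element is the value you asked about.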