Using RE to retrieve an ID
I am trying to use a regular expression (RE) to match a changing ID and extract it, but I am having trouble getting it to work. The string is:
m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
The code I have tried so far is:
r = re.compile(r'(s*\s*)(\S+)')
m = m.match(r)
Can anyone help me extract the ID from this string?
Thanks
>>> m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
>>> import re
>>> re.search(r'version (\S+)', m).group(1)
('1.0.41.476',)
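The attempt in the question fails for two reasons: strings have no `match` method (the compiled pattern does), and `match` only matches at the start of the string, so `search` is needed to find text in the middle. Note also that `group(1)` returns a plain string. Below is a minimal sketch that wraps the same pattern and returns `None` when nothing is found; `extract_version` and `VERSION_RE` are illustrative names, not part of the original posts.

```python
import re

# Illustrative names; the pattern itself is the one used in the answer above.
VERSION_RE = re.compile(r'version (\S+)')

def extract_version(text):
    """Return the token that follows 'version ', or None if it is absent."""
    found = VERSION_RE.search(text)  # search() scans the whole string
    return found.group(1) if found else None

m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
print(extract_version(m))             # the ID as a string
print(extract_version('no id here'))  # None, instead of an AttributeError
```

Checking for `None` before calling `group` avoids the `AttributeError` that the one-liner raises on input without the word "version".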
Here are RE-based and string-based versions:
import re

def bystr(text):
    words = text.split()
    index = words.index('version') + 1
    return words[index]

def byre(text, there=re.compile(r'version\s+(\S+)')):
    # The compiled pattern is a default argument, so it is built only once.
    return there.search(text).group(1)

m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'

if __name__ == '__main__':
    print(bystr(m))
    print(byre(m))
(Save this as are.py and run it as the main script to confirm that both functions return the same result: a string, not a tuple as an existing answer peculiarly shows.) Here is the timing of each, on my slow laptop:
$ python -mtimeit -s'import are' 'are.bystr(are.m)'
100000 loops, best of 3: 4.29 usec per loop
$ python -mtimeit -s'import are' 'are.byre(are.m)'
100000 loops, best of 3: 3.25 usec per loop
While REs often have a bad reputation in the Python community, even this simple example shows that, when appropriate, they can be faster than simple string manipulation: in this case, the RE version takes only about 3/4 of the time that the string version takes (3.25 vs. 4.29 usec per loop).
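The same comparison can also be run from inside Python with the standard `timeit` module. The following is a rough sketch, not the measurement used above: `best_usec` is a hypothetical helper, the two functions are repeated so the block is self-contained, and the `lambda` wrapper adds a small constant overhead to both timings. Absolute numbers depend on the machine and the Python version, so none are shown.

```python
import re
import timeit

M = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
_VERSION_RE = re.compile(r'version\s+(\S+)')

def bystr(text):
    words = text.split()
    return words[words.index('version') + 1]

def byre(text, there=_VERSION_RE):
    return there.search(text).group(1)

def best_usec(func, number=10000, repeat=3):
    """Best-of-`repeat` time per call, in microseconds (hypothetical helper)."""
    timings = timeit.repeat(lambda: func(M), number=number, repeat=repeat)
    return min(timings) / number * 1e6

if __name__ == '__main__':
    for func in (bystr, byre):
        print('%s: %.2f usec per call' % (func.__name__, best_usec(func)))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that the command-line tool reports.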
You don't necessarily have to use a regular expression to extract a substring.
def get_version_number(text):
    """Assumes that the word 'version' appears before the version number in the
    text."""
    words = text.split()
    index = words.index('version') + 1
    return words[index]

if __name__ == '__main__':
    m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
    print(get_version_number(m))
    print(repr(get_version_number(m)))
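As the docstring says, this function depends on the word "version" being present. If it is missing, `list.index` raises `ValueError`, and if it is the last word, the lookup raises `IndexError`. Here is a small sketch of a more forgiving variant; the name `get_version_or_default` and the `default` parameter are assumptions added for illustration.

```python
def get_version_or_default(text, default=None):
    """Return the word after 'version', or `default` if there is none."""
    words = text.split()
    try:
        return words[words.index('version') + 1]
    except (ValueError, IndexError):
        # ValueError: 'version' not found; IndexError: nothing follows it.
        return default

m = 'Some Text That exists version 1.0.41.476 Fri Jun 4 16:50:56 EDT 2010'
print(get_version_or_default(m))
print(get_version_or_default('no id here'))
```

Whether to return a default or let the exception propagate depends on how the caller wants to treat malformed input.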