Python and regex


#!/usr/bin/python
import re

str = raw_input("String containing email...\t")
match = re.search(r'[\w.-]+@[\w.-]+', str)
if match:
    print match.group()

It's not the most complicated code, and I'm looking for a way to get ALL of the matches, if that's possible.


It sounds like you want re.findall():

findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result.

As far as the actual regular expression for identifying email addresses goes... See this question.

Also, be careful using str as a variable name. This will hide the str built-in.
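To make the docstring above concrete, here is a minimal sketch of how findall() behaves with and without groups. The sample text and the variable names (text, emails, parts) are made up for illustration; the pattern is the one from the question, and the grouped variant is only an assumption about how you might want to split the address. The single-argument print() calls run on both Python 2 and Python 3.

```python
import re

# Sample text is invented for illustration; the name avoids shadowing the str built-in.
text = "Contact alice@example.com or bob@example.org for details."

# No groups in the pattern: findall() returns a list of whole matches.
emails = re.findall(r'[\w.-]+@[\w.-]+', text)
print(emails)

# Two groups in the pattern: findall() returns a list of (user, domain) tuples.
parts = re.findall(r'([\w.-]+)@([\w.-]+)', text)
print(parts)
```

Note that this simple pattern would also swallow a trailing period if an address ended a sentence, so treat it as a starting point rather than a validator.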


I guess that re.findall is what you're looking for.


You should try re.findall() (or re.finditer(), which yields match objects instead of strings).

findall() matches all occurrences of a pattern, not just the first one as search() does. For example, if one was a writer and wanted to find all of the adverbs in some text, he or she might use findall()

http://docs.python.org/library/re.html#finding-all-adverbs
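As a rough illustration of that difference, the sketch below runs search() and findall() with the same pattern on one sentence, in the spirit of the adverb example in the linked documentation. The variable names (first, all_adverbs) are my own, and the single-argument print() calls work on both Python 2 and Python 3.

```python
import re

text = "He was carefully disguised but captured quickly by police."

# search() stops at the first match and returns a match object (or None).
first = re.search(r"\w+ly\b", text)
print(first.group())   # carefully

# findall() scans the whole string and returns every non-overlapping match.
all_adverbs = re.findall(r"\w+ly\b", text)
print(all_adverbs)     # ['carefully', 'quickly']
```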


  • raw_input is only needed to read input from the console; to test a pattern you can simply assign the string directly.
  • Don't override built-ins such as str. Use a meaningful name and assign the whole string to it.

  • It is also often a good idea to compile your pattern into a regex object and match the string against that (illustrated in the code).

I just realized that a complete regex matching email addresses exactly as specified by RFC 822 could fill a page; short of that, this snippet should be useful.

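import re

# Test string assigned directly instead of being read with raw_input
inputstr = "something@exmaple.com, 121@airtelnet.com, ra@g.net, etc etc\t"
# Compile the pattern once so the regex object can be reused
mailsrch = re.compile(r'[\w\-][\w\-\.]+@[\w\-][\w\-\.]+[a-zA-Z]{1,4}')
# findall() returns every non-overlapping match as a list of strings
matches = mailsrch.findall(inputstr)
print matches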