Find email domain in address with regular expressions

I know I'm an idiot, but I can't pull the domain out of this email address:

'blahblah@gmail.com'

My desired output:

'@gmail.com'

My current output:

.

(it's just a period character)

Here's my code:

import re
test_string = 'blahblah@gmail.com'
domain = re.search('@*?\.', test_string)
print domain.group()

Here's what I think my regular expression says ('@*?\.', test_string):

 ' # begin to define the pattern I'm looking for (also tell python this is a string)

  @ # find all patterns beginning with the at symbol ("@")

  * # find all characters after the at symbol

  ? # find the last character before the period

  \ # breakout (don't use the next character as a wild card, use it as a string character)

  . # find the "." character

  ' # end definition of the pattern I'm looking for (also tell python this is a string)

  , test string # run the preceding search on the variable "test_string," i.e., 'blahblah@gmail.com'

I'm basing this off the definitions here:

http://docs.activestate.com/komodo/4.4/regex-intro.html

Also, I searched but other answers were a bit too difficult for me to get my head around.

Help is much appreciated, as usual. Thanks.

My stuff if it matters:

Windows 7 Pro (64 bit)

Python 2.6 (64 bit)


PS. StackOverflow question: My posts don't include new lines unless I hit "return" twice in between them. For example (these are all on a different line when I'm posting):

@ - find all patterns beginning with the at symbol ("@") * - find all characters after the at symbol ? - find the last character before the period \ - breakout (don't use the next character as a wild card, use it as a string character) . - find the "." character , test string - run the preceding search on the variable "test_string," i.e., 'blahblah@gmail.com'

That's why I got a blank line b/w every line above. What am I doing wrong? Thx.


Here's something I think might help

import re
s = 'My name is Conrad, and blahblah@gmail.com is my email.'
domain = re.search(r"@[\w.]+", s)
print(domain.group())

outputs

@gmail.com

How the regex works:

@ - match a literal at sign; re.search scans the string until it finds one.

[\w.] - a set of characters to match: \w is any letter, digit, or underscore, and the period adds a literal . to the set (inside brackets it is not a wildcard).

+ - one or more characters from the previous set.

Because this regex matches periods and word characters after an @, it'll match email domains even in the middle of sentences. It stops at the first character outside the set, so a domain containing a hyphen would be cut short.
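To see why the pattern in the question printed only a period, it may help to run both patterns side by side. This is a minimal sketch using the question's own test string; the variable names `original` and `fixed` are just for illustration.

```python
import re

test_string = 'blahblah@gmail.com'

# '@*?' means "zero or more '@' characters, as few as possible", so it is
# allowed to match nothing at all. The pattern then only needs a literal '.',
# and the first place that works is the period itself.
original = re.search(r'@*?\.', test_string)
print(original.group())  # .

# '@' followed by one or more word characters or periods.
fixed = re.search(r'@[\w.]+', test_string)
print(fixed.group())     # @gmail.com
```

In other words, `*` and `?` are modifiers on the preceding `@`, not stand-ins for "any characters".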


OK, so why not use split (or partition)?

"@"+'blahblah@gmail.com'.split("@")[-1]

Or you can use other string methods like find

>>> s="bal@gmail.com"
>>> s[ s.find("@") : ]
'@gmail.com'
>>>
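One caveat with `find`: it returns -1 when there is no "@", so the slice silently keeps the last character of the string. A rough sketch of the `partition` variant mentioned earlier, which returns an empty string in that case; the helper name `domain_with_at` is made up for this example.

```python
def domain_with_at(address):
    """Return '@' plus everything after the first '@', or '' if there is none."""
    _local, at, domain = address.partition("@")
    return at + domain

found = domain_with_at("bal@gmail.com")
missing = domain_with_at("no-at-sign")

# For comparison: find() returns -1 here, so the slice keeps the last character.
sliced = "no-at-sign"["no-at-sign".find("@"):]

print(repr(found))    # '@gmail.com'
print(repr(missing))  # ''
print(repr(sliced))   # 'n'
```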

And if you are going to extract email domains from some other text:

with open("file") as f:
    for line in f:
        # check each whitespace-separated word, not the list as a whole
        for word in line.split():
            if "@" in word:
                print("@" + word.split("@")[-1])


Using regular expressions:

>>> re.search('@.*', test_string).group()
'@gmail.com'

A different way:

>>> '@' + test_string.split('@')[1]
'@gmail.com'


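You can try using urllib (Python 3). Note that splituser is an undocumented helper that has been deprecated since Python 3.8, and that the result does not include the leading @: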

from urllib import parse
email = 'myemail@mydomain.com'
domain = parse.splituser(email)[1]

Output will be

'mydomain.com'


Just wanted to point out that chrisaycock's method ('@.*') would also match invalid email addresses of the form

herp@

To make sure you only match an address that actually has a domain, you need to alter it slightly so that at least one character must follow the @:

Using regular expressions:

>>> re.search('@.+', test_string).group()
'@gmail.com'
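A quick way to check the difference between the two patterns on the `herp@` case is sketched below. The `results` dictionary is only there for illustration.

```python
import re

results = {}
for pattern in (r'@.*', r'@.+'):
    match = re.search(pattern, 'herp@')
    # re.search returns None when nothing matches, so check before calling group().
    results[pattern] = match.group() if match else None

print(results)
```

With `@.+` there is no match at all, so calling `.group()` directly on the result would raise an AttributeError; the `if match` check avoids that.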


Using the regular expression below, you can extract any domain ending in .com or .in.

import re
s = 'my first email is user1@gmail.com second email is user2@yahoo.in and third email is user3@outlook.com'
print(re.findall(r'@[\w.-]+\.(?:com|in)\b', s))

output

['@gmail.com', '@yahoo.in', '@outlook.com']
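One thing to watch with suffix matching: square brackets define a character class, not a list of alternatives, so something like `[.in|.com|]` means "any one of the characters . i n | c o m". A minimal sketch of the difference, using made-up addresses and a grouped alternation `(?:com|in)` as one possible way to say "either suffix":

```python
import re

text = 'a@gmail.com b@yahoo.in c@example.io'  # made-up addresses

# Character class: the final [...] accepts any single one of . i n | c o m
loose = re.findall(r'@+\S+[.in|.com|]', text)

# Grouped alternation: the domain must end in .com or .in
strict = re.findall(r'@[\w.-]+\.(?:com|in)\b', text)

print(loose)   # ['@gmail.com', '@yahoo.in', '@example.io']
print(strict)  # ['@gmail.com', '@yahoo.in']
```

The character-class version accepts the `.io` address because its last letter, `o`, happens to be in the class.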


Here is another method using the index function:

email_addr = 'blahblah@gmail.com'

# Find the location of @ sign
index = email_addr.index("@")

# extract the domain portion starting from the index
email_domain = email_addr[index:]

print(email_domain)
#------------------
# Output:
# @gmail.com
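Unlike `find`, `index` raises a ValueError when the string contains no "@". A small sketch of guarding against that; the function name `domain_or_none` is hypothetical.

```python
def domain_or_none(email_addr):
    """Return '@domain', or None when email_addr contains no '@'."""
    try:
        index = email_addr.index("@")
    except ValueError:
        # str.index raises ValueError when the substring is absent
        return None
    return email_addr[index:]

print(domain_or_none('blahblah@gmail.com'))  # @gmail.com
print(domain_or_none('blahblah'))            # None
```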