How can I extract email addresses from between '<' and '>'?

Posted 2023-03-29 17:52

I've got a list of emails and names from Outlook, semicolon delimited, like this:

fname lname <email>; fname2 lname2 <email2>; ... ; fnameN lnameN <emailN>

And I'd like to extract the emails and semicolon delimit them like this:

email1; email2; ... ; emailN

How can I do this in Python?

Using regex:

import re
# matches everything between < and > (excluding the brackets themselves)
ptrn = re.compile(r"<([^>]+)>")
# findall returns ['email', 'email2']; join concatenates them with '; '
print('; '.join(ptrn.findall("fname lname <email>; fname2 lname2 <email2>;")))
# email; email2
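As a rough sketch, the same regex can be wrapped in a small helper so it is easy to reuse on the whole Outlook string. The function name extract_emails and the example.com addresses are made up for illustration, and the code assumes Python 3:

```python
import re

# Captures the text between '<' and the next '>' (brackets excluded).
ANGLE_PATTERN = re.compile(r"<([^>]+)>")


def extract_emails(raw):
    """Return every bracketed address found in raw, joined with '; '."""
    return "; ".join(ANGLE_PATTERN.findall(raw))


# Placeholder addresses, not taken from the question.
sample = "fname lname <a@example.com>; fname2 lname2 <b@example.com>;"
print(extract_emails(sample))
# a@example.com; b@example.com
```

Because the pattern only looks at the angle brackets, a trailing semicolon or a name made of several words should not affect the result.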

Using list comprehension:

em = "fname lname <email>; fname2 lname2 <email2>; fnameN lnameN <emailN>"
email_list = [entry.split()[-1][1:-1] for entry in em.split(';')]

# email_list:
# ['email', 'email2', 'emailN']

Breakdown:

for entry in em.split(';')

First it splits up the original string by the semi-colon.

entry.split()

Next it takes each entry and splits it again, this time on whitespace.

entry.split()[-1]

Next it selects the last entry from the split, which is your email.

entry.split()[-1][1:-1]

This takes your email, which is in the form "<email@addr.com>", and selects the string inside the angle brackets ([1:-1] selects from the second character through the second-to-last, dropping the '<' and '>').
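The breakdown above can also be written as an explicit loop, which makes each step visible. This is only a sketch assuming Python 3; the helper name split_emails and the check that skips empty entries (for example after a trailing semicolon) are additions, not part of the original one-liner:

```python
def split_emails(em):
    """Extract addresses by splitting on ';' and then on whitespace."""
    email_list = []
    for entry in em.split(';'):        # one "fname lname <email>" chunk per entry
        tokens = entry.split()         # e.g. ['fname', 'lname', '<email>']
        if not tokens:                 # skip empty chunks, e.g. after a trailing ';'
            continue
        bracketed = tokens[-1]         # last token: '<email>'
        email_list.append(bracketed[1:-1])  # drop the surrounding '<' and '>'
    return email_list


em = "fname lname <email>; fname2 lname2 <email2>; fnameN lnameN <emailN>;"
print('; '.join(split_emails(em)))
# email; email2; emailN
```

Like the one-liner, this relies on the address being the last whitespace-separated token of each entry.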

Variations on the same theme:

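s = 'fname lname <email>; fname2 lname2 <email2>; ... ; fnameN lnameN <emailN>'

# for each '<', slice up to the first '>' that follows it
print([s[i+1 : s.find('>', i)] for i, c in enumerate(s) if c == '<'])

# OR

# gen yields the index of every bracket; each '<' index is paired with the next '>' index
gen = (i for i, c in enumerate(s) if c in '<>')
print([s[a+1 : next(gen)] for a in gen])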
