Ruby/Rails Parsing Emails
I'm currently using the following to parse emails:
def parse_emails(emails)
  valid_emails, invalid_emails = [], []
  unless emails.nil?
    emails.split(/, ?/).each do |full_email|
      unless full_email.blank?
        if full_email.index(/\<.+\>/)
          email = full_email.match(/\<.*\>/)[0].gsub(/[\<\>]/, "").strip
        else
          email = full_email.strip
        end
        email = email.delete("<").delete(">")
        email_address = EmailVeracity::Address.new(email)
        if email_address.valid?
          valid_emails << email
        else
          invalid_emails << email
        end
      end
    end
  end
  return valid_emails, invalid_emails
end
The problem I'm having is that, given an entry like:
Bob Smith <bob@smith.com>
the code above discards "Bob Smith" and returns only bob@smith.com.
What I want instead is a hash of FNAME, LNAME and EMAIL, where fname and lname are optional but email is not.
What type of Ruby object should I use for that, and how would I build such a record in the code above?
Thanks
I've coded it so that it works even with an entry like: John Bob Smith Doe <bob@smith.com>
It would return:
{:email => "bob@smith.com", :fname => "John", :lname => "Bob Smith Doe" }
def parse_emails(emails)
  valid_emails, invalid_emails = [], []
  unless emails.nil?
    emails.split(/, ?/).each do |full_email|
      unless full_email.blank?
        if (index = full_email.index(/\<.+\>/))
          email = full_email.match(/\<.*\>/)[0].gsub(/[\<\>]/, "").strip
          # exclusive range: when index is 0 (nothing before "<") the name is empty
          name = full_email[0...index].split(" ")
          fname = name.first
          # drop(1) returns [] for an empty or one-word name, so lname becomes ""
          lname = name.drop(1).join(" ")
        else
          email = full_email.strip
          # your choice what a bare string should be: only mail, only name?
          # fname and lname stay nil here
        end
        email = email.delete("<").delete(">")
        email_address = EmailVeracity::Address.new(email)
        if email_address.valid?
          valid_emails << { :email => email, :lname => lname, :fname => fname }
        else
          invalid_emails << { :email => email, :lname => lname, :fname => fname }
        end
      end
    end
  end
  return valid_emails, invalid_emails
end
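If you want to try the name-splitting step on its own, here is a rough plain-Ruby sketch. Note the assumptions: `split_name_and_email` is a made-up helper name, it handles a single entry rather than a comma-separated list, and it leaves out the EmailVeracity check entirely.

```ruby
# Hypothetical helper (not part of EmailVeracity or Rails): splits one entry
# such as "John Bob Smith Doe <bob@smith.com>" into a hash.
# Validation of the address is deliberately left out.
def split_name_and_email(full_email)
  entry = full_email.to_s.strip
  if (m = entry.match(/<(.+)>/))
    email = m[1].strip
    # Everything before the "<" is treated as the display name.
    name_parts = entry[0...m.begin(0)].split(" ")
  else
    # No angle brackets: assume the whole entry is the address.
    email = entry
    name_parts = []
  end
  # First word is the first name; the rest (possibly empty) is the last name.
  { :email => email, :fname => name_parts.first, :lname => name_parts.drop(1).join(" ") }
end

p split_name_and_email("John Bob Smith Doe <bob@smith.com>")
p split_name_and_email("<bob@smith.com>")
p split_name_and_email("bob@smith.com")
```

With no name present, `:fname` comes back as nil and `:lname` as an empty string, which fits the "optional" requirement from the question.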
Here's a slightly different approach that works better for me. It grabs the name whether it comes before or after the email address, and whether or not the email address is in angle brackets.
I don't try to separate the first name from the last name, since that is too error-prone (e.g. "Mary Ann Smith" or "Dr. Mary Smith"), but I do eliminate duplicate email addresses.
def parse_list(list)
  ## case-insensitive match; {2,} rather than {2,4} so that longer TLDs
  ## such as .museum are not cut short
  r = Regexp.new('[a-z0-9\.\_\%\+\-]+@[a-z0-9\.\-]+\.[a-z]{2,}', Regexp::IGNORECASE)
  valid_items, invalid_items = {}, []
  ## split the list on commas and/or newlines
  list_items = list.split(/[,\n]+/)
  list_items.each do |item|
    if m = r.match(item)
      ## get the email address
      email = m[0]
      ## get everything before the email address
      before_str = item[0, m.begin(0)]
      ## get everything after the email address
      after_str = item[m.end(0), item.length]
      ## enter the email as a valid_items hash key (eliminating dups)
      ## make the value of that key anything before the email if it contains
      ## any alphanumerics, stripping out any angle brackets
      ## and leading/trailing space
      if /\w/ =~ before_str
        valid_items[email] = before_str.gsub(/[\<\>\"]+/, '').strip
      ## if nothing before the email, make the value of that key anything after
      ## the email, stripping out any angle brackets and leading/trailing space
      elsif /\w/ =~ after_str
        valid_items[email] = after_str.gsub(/[\<\>\"]+/, '').strip
      ## if nothing after the email either,
      ## make the value of that key an empty string
      else
        valid_items[email] = ''
      end
    else
      invalid_items << item.strip if item.strip.length > 0
    end
  end
  [valid_items, invalid_items]
end
It returns a hash with valid email addresses as keys and the associated names as values. Any invalid items are returned in the invalid_items array.
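To see how the before/after slicing behaves, here is a stripped-down sketch of just that step for a single item. The names `EMAIL_RE` and `name_around_email` are only illustrative, and the pattern is a simplified one, not a full RFC-compliant matcher.

```ruby
# Simplified, illustrative pattern; it only finds email-looking substrings.
EMAIL_RE = /[a-z0-9._%+\-]+@[a-z0-9.\-]+\.[a-z]{2,}/i

# Hypothetical helper: returns [email, name] for one list item, or nil if
# the item contains no email-looking substring.
def name_around_email(item)
  m = EMAIL_RE.match(item)
  return nil unless m
  before = item[0, m.begin(0)]  # text preceding the address
  after  = m.post_match         # text following the address
  # Prefer text before the address; fall back to text after it.
  source = (before =~ /\w/) ? before : after
  [m[0], source.gsub(/[<>"]+/, "").strip]
end

p name_around_email('"Mary Ann Smith" <mary@example.com>')
p name_around_email("mary@example.com Mary Ann Smith")
p name_around_email("no address here")
```

The first two calls should both pair mary@example.com with "Mary Ann Smith", and the last one returns nil because nothing in the item looks like an address.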
See http://www.regular-expressions.info/email.html for an interesting discussion of email regexes.
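If you would rather not maintain your own pattern for the validity check, Ruby's standard library ships one as `URI::MailTo::EMAIL_REGEXP`. A small sketch, where `plausible_email?` is just a name I picked for illustration:

```ruby
require "uri"

# URI::MailTo::EMAIL_REGEXP comes with Ruby's standard library. It is anchored
# to the whole string, so it checks a bare address, not "Name <address>".
def plausible_email?(str)
  URI::MailTo::EMAIL_REGEXP.match?(str.to_s)
end

puts plausible_email?("bob@smith.com")
puts plausible_email?("Bob Smith <bob@smith.com>")
puts plausible_email?("not an email")
```

Because the pattern is anchored, you would still need to pull the address out of an entry like "Bob Smith <bob@smith.com>" first and then check only that part. It is also a syntax check only; it says nothing about whether the mailbox exists.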
I made a little gem out of this in case it might be useful to someone at https://github.com/victorgrey/email_addresses_parser
You can use the rfc822 gem. It contains a regular expression for finding email addresses that conform to the RFC. You can easily extend it with parts that capture the first and last name.
Along the lines of mspanc's answer, you can use the mail gem to do the basic email address parsing work for you, as answered here: https://stackoverflow.com/a/12187502/1019504