PHP Regular Expressions - Cannot get my head around
I am trying to create three PHP regular expressions which do three things:
- Get emails, e.g. mr.jones@apple-land.com
- Get dates, e.g. 31/05/90 or 31-Jun-90
- Get nameservers, e.g. ns1.apple.co.uk
I have a big chunk of text and want to extract these things from it.
What I have so far is:
$regexp = '/[A-Za-z0-9\.]+[@]{1}[A-Za-z0-9\.]+[A-Za-z]{2,4}/i';
preg_match_all($regexp, $output, $email);
$regexp = '/[A-Za-z0-9\.]+[^@]{1}/i';
preg_match_all($regexp, $output, $nameservers);
$regexp = '/[0-9]{2,4}[-\/]{1}([A-Za-z]{3}|[0-9]{2})[-\/]{1}[0-9]{2,4}/i';
preg_match_all($regexp, $output, $dates);
Dates and emails work, but I don't know if that is an efficient way to do it.
Nameservers just don't work at all. Essentially I want to find any combinations of letters and numbers which have dots in between but not @ symbols.
Any help would be greatly appreciated.
Thanks
Regexes for emails are fairly complex. This is one place where frameworks shine. Most of the popular ones have validation components which you can use to solve these problems. I'm most familiar with Zend Framework validation, and Symfony2 and CakePHP also provide good solutions. Often these solutions are written to the appropriate RFC specification and include support for things that programmers often overlook, like the fact that + is valid in an email address. They also protect against common mistakes that programmers make. Currently, your email regex will allow an email address that looks like this: .@.qt, which is not valid.
Some may argue that using a framework to validate an email or hostname (which can have a - in it as well) is overkill. I feel it is worth it.
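If pulling in a framework is not an option, PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL is a lighter alternative for checking candidates after a loose regex has extracted them. The following is a minimal sketch under that assumption; it is not the Zend Framework validator mentioned above, and the sample addresses are made up for illustration:

```php
<?php
// Candidates as a loose extraction regex might return them (made-up samples).
$candidates = ['mr.jones@apple-land.com', 'first+tag@example.com', '.@.qt'];

// Keep only the candidates that PHP's built-in email validator accepts.
$valid = array_values(array_filter($candidates, function ($address) {
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}));

print_r($valid);
```

The idea is to let the regex stay permissive for extraction and leave the strict checking to a validator.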
"essentially I want to find any combinations of letters and numbers which have dots in between but not @ symbols.."
Regexp for finding all letters and numbers which have dots in between:
$regexp = '/[A-Za-z0-9]{1,}(\.[A-Za-z0-9]{1,}){1,}/i';
Please note that you don't have to exclude '@' explicitly if the pattern you are matching with doesn't include the @.
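A small sketch of how this pattern behaves with preg_match_all, using made-up sample text in place of the asker's $output. One thing worth seeing in practice: the pattern never consumes an "@", but it can still match the domain part of an email address. The second pattern is my own hedged variant, not part of the answer above; it adds a lookbehind so a match cannot start right after an "@" or in the middle of a longer token (dotted local parts such as mr.jones would still be picked up).

```php
<?php
// Made-up sample text; $output stands in for the asker's big chunk of text.
$output = "Name servers: ns1.apple.co.uk ns2.apple.co.uk\nContact: admin@example.com";

// One or more alphanumerics, followed by at least one ".label" group.
$regexp = '/[A-Za-z0-9]+(\.[A-Za-z0-9]+)+/';
preg_match_all($regexp, $output, $nameservers);

// $nameservers[0] holds the full matches. "example.com" from the email
// address is matched too, because the pattern knows nothing about "@".
print_r($nameservers[0]);

// Hypothetical variant: refuse to start a match directly after "@",
// a dot, or another alphanumeric character.
$strict = '/(?<![@A-Za-z0-9.])[A-Za-z0-9]+(\.[A-Za-z0-9]+)+/';
preg_match_all($strict, $output, $strictMatches);
print_r($strictMatches[0]);
```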
I would recommend using different patterns for your examples:
[\w\.-]+@\w+\.[a-zA-Z]{2,4}
for emails.
\d{1,2}[/-][\da-zA-Z]{1,3}[/-]\d{2,4}
for dates.
([a-zA-Z\d]+\.){2,3}[a-zA-Z\d]+
for nameservers.
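As a rough sketch, the three patterns can be run over one piece of text like this. The sample text and the $patterns / $found names are made up for illustration. Two assumptions to note: "~" is used as the delimiter so the "/" in the date pattern needs no escaping, and the sample email uses apple.com because the \w+ in the email pattern does not match a hyphen, so mr.jones@apple-land.com would not be found by it.

```php
<?php
// Made-up sample text standing in for the asker's $output.
$output = "Registered 31/05/90, renewed 31-Jun-90 by mr.jones@apple.com via ns1.apple.co.uk";

// The three patterns from above, wrapped in "~" delimiters.
$patterns = [
    'emails'      => '~[\w\.-]+@\w+\.[a-zA-Z]{2,4}~',
    'dates'       => '~\d{1,2}[/-][\da-zA-Z]{1,3}[/-]\d{2,4}~',
    'nameservers' => '~([a-zA-Z\d]+\.){2,3}[a-zA-Z\d]+~',
];

// Collect the full matches for each pattern.
$found = [];
foreach ($patterns as $name => $pattern) {
    preg_match_all($pattern, $output, $matches);
    $found[$name] = $matches[0];
}

print_r($found);
```

Because the nameserver pattern requires at least two "label." groups, a two-part host such as apple.com is skipped, which keeps the email's domain out of the nameserver results in this sample.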
Good luck ;)
For the nameservers I would suggest using: /[^.](\.[a-z_\d]+){3,}/i