Regular expression for bounce email message

I am looking for a regular expression (or another method, if one exists) for detecting bounce email messages. So far I have been going through our unattended mailbox and adding the strings I find to a regex. I figured someone would already have something complete, which would save me from reinventing the wheel.

Here is an example of what I have so far:

/reason: 550|permanent fatal errors|Error 550|Action: Failed|Mailbox does not exist|Delivery to the following recipients failed/i


Email servers are too varied for this to work 100% of the time, but you might have better luck looking in the headers of the message instead of its body: the headers are meant to be machine-readable, unlike the body.

I'd start by looking for any headers with 'error' in them.
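As a rough illustration of the header-based approach, here is a minimal Python sketch. It assumes the bounce follows the standard delivery status notification format (RFC 3464), where the message is `multipart/report` with `report-type=delivery-status` and carries a `message/delivery-status` part. The function name `parse_dsn` and the returned keys are made up for this example, and many servers send non-standard bounces that it will not recognise.

```python
import email
from typing import Optional


def parse_dsn(raw_message: str) -> Optional[dict]:
    """Return DSN fields if raw_message looks like an RFC 3464 report, else None.

    Hypothetical helper for illustration only.
    """
    msg = email.message_from_string(raw_message)

    # Standard bounces are multipart/report with report-type=delivery-status.
    if msg.get_content_type() != "multipart/report":
        return None
    report_type = msg.get_param("report-type")
    if not isinstance(report_type, str) or report_type.lower() != "delivery-status":
        return None

    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        payload = part.get_payload()
        if not isinstance(payload, list):
            continue
        # The stdlib parser exposes each header block of this part as a nested
        # Message: the per-message block first, then the per-recipient blocks.
        for block in payload:
            recipient = block.get("Final-Recipient")
            if recipient is None:
                continue
            return {
                # "rfc822; user@example.org" -> "user@example.org"
                "recipient": recipient.split(";", 1)[-1].strip(),
                "action": (block.get("Action") or "").strip().lower(),
                "status": (block.get("Status") or "").strip(),
            }
    return None
```

Since the function returns None for anything that is not a standard report, text matching on the body would still be needed as a fallback for the non-standard cases.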


It may be overkill for your case, but the most accurate solution is probably to use a spam filtering tool: they all need to be able to handle bounces gracefully, and they will have put a lot of effort into reducing false positives.

I would suggest SpamAssassin, personally. It is packaged as a Perl module with a command-line interface, "spamassassin", that can probably be coerced into doing what you need. The bounce message rule is called (unsurprisingly) BOUNCE_MESSAGE. Unfortunately, it is not as simple as a regular expression you can copy.


You're probably better off looking at the full headers of some bounced messages and identifying common elements in the X- headers that the server may have included. This will give you far fewer false positives than parsing the subject line.


Generate a unique Return-Path: email address for each recipient. Have a catch-all account on the POP3 server that receives those addresses and match incoming bounces against them. Basically, this is VERP (Variable Envelope Return Path).
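A minimal Python sketch of the VERP idea, assuming the common convention of encoding the recipient into the local part with "=" in place of "@". The `bounces` prefix, the `lists.example.com` domain, and both function names are placeholders chosen for this example, not part of any standard API.

```python
from typing import Optional


def encode_verp(recipient: str, prefix: str = "bounces",
                domain: str = "lists.example.com") -> str:
    """Build a per-recipient envelope sender, e.g. for the Return-Path."""
    local, _, rcpt_domain = recipient.rpartition("@")
    return f"{prefix}+{local}={rcpt_domain}@{domain}"


def decode_verp(address: str, prefix: str = "bounces") -> Optional[str]:
    """Recover the original recipient from the address a bounce was sent to."""
    local, _, _domain = address.strip("<>").rpartition("@")
    tag = prefix + "+"
    if not local.startswith(tag):
        return None
    # Domains cannot contain "=", so the last "=" separates user from domain.
    user, sep, rcpt_domain = local[len(tag):].rpartition("=")
    if not sep or not user:
        return None
    return f"{user}@{rcpt_domain}"
```

With this scheme, any mail arriving at a `bounces+...` address on the catch-all account identifies the failed recipient without inspecting the bounce body at all.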


This works for me and covers pretty much all hard bounces. It is written in Perl, but you can fairly safely roll your own version in another language using the same patterns.

    use strict;
    use warnings;

    my $content = 'EMAIL MESSAGE HEADER AND BODY';
    my $recipient;
    if (
        $content =~ m/Status: 5\.\d+\.\d+/i ||    # Any 5.x.x (permanent) status code
        $content =~ m/Action: Failed/i ||
        $content =~ m/Reason: 5\.\d+\.\d+/i ||    # Any 5.x.x (permanent) status code
        $content =~ m/MAILER-DAEMON/i ||
        $content =~ m/Mailbox does not exist/i ||
        $content =~ m/No Such User/i ||
        $content =~ m/Delivery to the following recipients failed/i ||
        $content =~ m/Recipient address rejected/i ||
        $content =~ m/Host or domain name not found/i ||
        $content =~ m/mailbox unavailable/i
    ) {
        # Extract the email address from the Final-Recipient header.
        # $recipient stays undefined if the header is missing.
        ($recipient) = $content =~ m/^final-recipient:\s*rfc822;?\s*(\S+)/im;
    }
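The status-code checks above can also be used to separate hard bounces from soft ones. The following Python sketch assumes the message contains a `Status:` line with an enhanced status code as defined in RFC 3463, where class 5 means permanent failure and class 4 means a persistent transient failure. The name `classify_bounce` and the returned labels are invented for this example.

```python
import re
from typing import Optional

# Matches a "Status: X.Y.Z" line anywhere in the message (one per line).
STATUS_RE = re.compile(
    r"^Status:\s*([245])\.\d{1,3}\.\d{1,3}",
    re.IGNORECASE | re.MULTILINE,
)


def classify_bounce(content: str) -> Optional[str]:
    """Return 'hard', 'soft', or 'delivered' from the first Status line found."""
    match = STATUS_RE.search(content)
    if match is None:
        return None  # No enhanced status code; fall back to text matching.
    return {"2": "delivered", "4": "soft", "5": "hard"}[match.group(1)]
```

Only hard bounces usually justify removing an address immediately; soft bounces such as a full mailbox may succeed on a later attempt.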