
Get only the body of an HTML email in PHP

So I have a PHP script which takes in piped emails, appends a footer to them and passes them on.

But if someone sends an email that is already in HTML format, the script inserts the entire HTML email into the body of a new HTML document. I need a script that will (if the email is already in HTML) strip the html, head and body tags, leaving only the original email content.

I.e. if someone sent an email

<html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html>

It is parsed by my script to become

<html><body><html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html></body></html>

How can I get it to take off the HTML structure if it exists?


I don't think it's possible to detect if the html element is present when working with DOMDocument and HTML because loadHTML() will add its own html element if it is not present.

The code below always returns the serialised contents of the body element, whether or not the input had its own html and body tags.

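$dom = new DOMDocument;
// loadHTML() adds html and body elements if the input lacks them.
$dom->loadHTML($html);

// Serialise each child of body in turn to build its inner HTML.
$body = '';
foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
    $body .= $dom->saveHTML($child);
}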
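
For the footer script in the question, the same loop could be wrapped in a function. This is only a minimal sketch: the name extractBodyHtml() is made up for illustration, and the empty-input guard and the libxml error handling are additions that the loop itself does not need.

```php
<?php
// Hypothetical helper (name is illustrative): returns the inner HTML of the
// body element, whether or not the input had its own html/body wrappers.
function extractBodyHtml(string $html): string
{
    // Guard: loadHTML() rejects an empty string.
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument;

    // Collect parser warnings instead of emitting them; email markup is often messy.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $bodyElement = $dom->getElementsByTagName('body')->item(0);
    if ($bodyElement === null) {
        return '';
    }

    // Serialise each child of body to rebuild its inner HTML.
    $body = '';
    foreach ($bodyElement->childNodes as $child) {
        $body .= $dom->saveHTML($child);
    }

    return $body;
}

// Usage: strip any existing structure, then add the footer once.
$email = '<html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html>';
$output = '<html><body>' . extractBodyHtml($email) . '<p>My footer</p></body></html>';
```

Keep in mind that loadHTML() assumes ISO-8859-1 unless the markup declares a charset, so UTF-8 emails may need extra handling that this sketch does not attempt.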


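Alternatively, you could parse the markup as XML and check for an html element yourself, but input without a single root element (a documentElement) will not parse. I solved that by wrapping the input in a dummy documentElement, though it's a bit clunky (I'd probably stick to the above code myself). Note that loadXML() only accepts well-formed markup, which many HTML emails are not.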

// Need a documentElement so wrap it with some generic garbage.
$html = '<garbage>' . $html . '</garbage>';

$dom = new DOMDocument;
$dom->loadXML($html);

if ($dom->getElementsByTagName('html')->length) {
    ...
}
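
The check above could be packaged as a small predicate. This is a sketch under assumptions rather than a robust detector: the name hasHtmlElement() is hypothetical, and treating unparseable input as "no html element" is a choice made here for illustration.

```php
<?php
// Hypothetical helper (name is illustrative): true when the markup parses as
// XML and contains an <html> element.
function hasHtmlElement(string $html): bool
{
    $dom = new DOMDocument;

    // Collect parse errors instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    // Dummy root so input without a single top-level element can still parse.
    $loaded = $dom->loadXML('<garbage>' . $html . '</garbage>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumption: input that is not well-formed XML counts as "no html element".
    return $loaded && $dom->getElementsByTagName('html')->length > 0;
}
```

Tag names are case-sensitive in XML, so an email that uses <HTML> would not be matched, and a leading DOCTYPE or XML declaration would make the wrapped string fail to parse. Both are reasons to prefer the loadHTML() approach.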

