Get only the body of an HTML email in PHP
So I have a PHP script which takes in piped emails, appends a footer to them and passes them on.
But if someone sends an email that is already in HTML format, it just inserts the entire HTML email into the body of a new HTML document. I need a script which will (if the email is already in HTML) strip the html, head and body tags, leaving the original email content.
For example, if someone sends this email:
<html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html>
my script turns it into:
<html><body><html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html></body></html>
How can I get it to take off the HTML structure if it exists?
I don't think it's possible to detect whether the html element is present when working with DOMDocument and HTML, because loadHTML() will add its own html element if one is not present.
The code below will always return the serialised HTML of the contents of the body element.
$dom = new DOMDocument;
$dom->loadHTML($html);
$body = '';
// Serialise each child of <body>, which drops the html/head/body wrappers.
foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
    $body .= $dom->saveHTML($child);
}
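As a rough sketch, the loop above could be wrapped in a helper that the piping script calls before appending its footer. The function name extractBodyHtml and the $footer value are hypothetical and not part of the original script. The libxml_use_internal_errors() calls and the empty-input guard are added assumptions, meant to keep warnings about sloppy email markup out of the output.

```php
<?php
// Hypothetical helper: returns the inner HTML of the <body> element,
// whether or not the input already had html/head/body tags.
function extractBodyHtml(string $html): string
{
    // Guard against empty input, which loadHTML() rejects.
    if (trim($html) === '') {
        return '';
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $bodyNode = $dom->getElementsByTagName('body')->item(0);
    if ($bodyNode === null) {
        return '';
    }

    // Serialise each child of <body> in document order.
    $body = '';
    foreach ($bodyNode->childNodes as $child) {
        $body .= $dom->saveHTML($child);
    }
    return $body;
}

// Usage: rebuild a single HTML document around the original content.
$email   = '<html><body>This is my awesome input email which is <strong>already</strong> in HTML</body></html>';
$footer  = '<p>Sent via the mailing list</p>'; // hypothetical footer
$message = '<html><body>' . extractBodyHtml($email) . $footer . '</body></html>';
```

One thing to watch: when the input has no markup at all, the parser typically wraps the bare text in a p element, so the result may not be byte-for-byte identical to the input.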
Alternatively, you could treat the HTML as XML and then detect it, but without a documentElement you may have problems. I solved that by adding a dummy documentElement, though it's a bit clunky (I'd probably stick to the above code myself).
// Need a documentElement so wrap it with some generic garbage.
$html = '<garbage>' . $html . '</garbage>';
$dom = new DOMDocument;
$dom->loadXML($html);
if ($dom->getElementsByTagName('html')->length) {
...
}
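A minimal sketch of how that detection might be packaged, assuming the input is well-formed enough to parse as XML. The function name containsHtmlElement and the choice to return null on a parse failure are assumptions for illustration. Real-world email HTML (unclosed br tags, a doctype, entities such as &nbsp;) will often fail to parse as XML, and XML tag names are case-sensitive, so an uppercase HTML tag would not be matched.

```php
<?php
// Hypothetical helper: true/false if the markup parses as XML,
// null if it does not parse and so cannot be checked this way.
function containsHtmlElement(string $markup): ?bool
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    // Wrap in a dummy root so fragments have a documentElement.
    $loaded = $dom->loadXML('<garbage>' . $markup . '</garbage>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }
    return $dom->getElementsByTagName('html')->length > 0;
}

// Usage
$hasWrapper = containsHtmlElement('<html><body>Hi</body></html>');
$isFragment = containsHtmlElement('Hi <strong>there</strong>');
```

Because of the null case, callers would still need a fallback, which is one more reason the body-extraction approach is the simpler option.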