How to import an XML string into a PHP DOMDocument

For example, I create a DOMDocument like this:

<?php

$implementation = new DOMImplementation();

$dtd =
  $implementation->createDocumentType
  (
    'html',                                     // qualifiedName
    '-//W3C//DTD XHTML 1.0 Transitional//EN',   // publicId
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-'
      .'transitional.dtd'                       // systemId
  );

$document = $implementation->createDocument('', '', $dtd);

$elementHtml     = $document->createElement('html');
$elementHead     = $document->createElement('head');
$elementBody     = $document->createElement('body');
$elementTitle    = $document->createElement('title');
$textTitre       = $document->createTextNode('My bweb page');
$attrLang        = $document->createAttribute('lang');
$attrLang->value = 'en';

$document->appendChild($elementHtml);
$elementHtml->appendChild($elementHead);
$elementHtml->appendChild($attrLang);
$elementHead->appendChild($elementTitle);
$elementTitle->appendChild($textTitre);
$elementHtml->appendChild($elementBody);

Now, suppose I have an XHTML string like this:

<?php
$xhtml = '<h1>Hello</h1><p>World</p>';

How can I import it into the <body> node of my DOMDocument?

So far, the only solution I have found is something like this:

<?php
$simpleXmlElement = new SimpleXMLElement($xhtml);

$domElement = dom_import_simplexml($simpleXmlElement);

$domElement = $document->importNode($domElement, true);

$elementBody->appendChild($domElement);

This solution seems very bad to me, and it causes problems, for example when I try it with a string like this:

<?php
$xhtml = '<p>Hello&nbsp;World</p>';

OK, I can work around this problem by converting the XHTML entities into Unicode entities, but that is so ugly...

Any help?

Thanks in advance!

Related question :

  • DOMDocument::validate() problem (solved)


The problem is that DOM does not take the XHTML DTD into account unless you validate the document against it. Until you do that, DOM doesn't know any of the entities defined in the DTD, nor any of its other rules. Fortunately, we sorted out how to do the validation in that other question, so armed with that knowledge you can do

$document->validate(); // anywhere before importing the other DOM

And then import with

$fragment = $document->createDocumentFragment();
$fragment->appendXML('<h1>Hello</h1><p>Hello&nbsp;World</p>');
$document->getElementsByTagName('body')->item(0)->appendChild($fragment);
$document->formatOutput = TRUE;
echo $document->saveXml();

outputs:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>My bweb page</title>
  </head>
  <body>
    <h1>Hello</h1>
    <p>Hello&nbsp;World</p>
  </body>
</html>

The other way to import XML into another DOM is to use importNode():

$one = new DOMDocument;
$two = new DOMDocument;
$one->loadXml('<root><foo>one</foo></root>');
$two->loadXml('<root><bar><sub>two</sub></bar></root>');
$bar = $two->documentElement->firstChild; // we want to import the bar tree
$one->documentElement->appendChild($one->importNode($bar, TRUE));
echo $one->saveXml();

outputs:

<?xml version="1.0"?>
<root><foo>one</foo><bar><sub>two</sub></bar></root>

However, this cannot work with

<h1>Hello</h1><p>Hello&nbsp;World</p>

because when you load a document into DOM, DOM overwrites everything you told it about the document before. Thus, when using load, libxml (and thus SimpleXML, DOM and XMLReader) does not know you mean XHTML. It does not know any of the entities defined in the XHTML DTD and will fuss about them instead. But even if the string did not contain the entity, it would not be valid XML, because it lacks a single root node. That's why you use the fragment.
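Both failure modes described above (no single root node, and an entity that libxml has not been told about) can be checked directly. The following is only a minimal sketch: the labels and the $results array are made up for illustration, and the wrapping <div> in the first candidate is an assumption added to show a well-formed counterpart.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$candidates = [
    'single root'      => '<div><h1>Hello</h1><p>World</p></div>',
    'no single root'   => '<h1>Hello</h1><p>World</p>',
    'undefined entity' => '<p>Hello&nbsp;World</p>',
];

$results = [];
foreach ($candidates as $label => $xml) {
    $dom = new DOMDocument();
    // loadXML() returns false when the string is not well-formed XML.
    $results[$label] = $dom->loadXML($xml);
    libxml_clear_errors();
}

var_dump($results);
```

Only the first candidate should load; the other two are rejected by the XML parser, which is why the fragment route is needed.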


You can use a DOMDocumentFragment for this:

$fragment = $document->createDocumentFragment();
$fragment->appendXml($xhtml);
$elementBody->appendChild($fragment);

That's all there is to it...
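One caveat: appendXML() returns false when the string cannot be parsed, for instance when it contains &nbsp; and the entities from the DTD have not been loaded. A minimal sketch of checking that result follows; the helper name appendXhtml() and the bare <body> document are assumptions made for illustration, not part of the code above.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: parse $xhtml into a fragment and append it to $target.
// Returns false (and appends nothing) when the string cannot be parsed.
function appendXhtml(DOMDocument $document, DOMNode $target, string $xhtml): bool
{
    $fragment = $document->createDocumentFragment();
    if (!$fragment->appendXML($xhtml)) {
        libxml_clear_errors();
        return false;
    }
    $target->appendChild($fragment);
    return true;
}

// Stand-in for the document built in the question (no DTD attached here).
$document    = new DOMDocument();
$elementBody = $document->appendChild($document->createElement('body'));

$okPlain  = appendXhtml($document, $elementBody, '<h1>Hello</h1><p>World</p>');
$okEntity = appendXhtml($document, $elementBody, '<p>Hello&nbsp;World</p>');

var_dump($okPlain, $okEntity);
echo $document->saveXML();
```

Checking the return value this way makes it obvious when a string needs the entity handling discussed elsewhere in this thread instead of silently leaving the body empty.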

Edit: Well, if you must accept XHTML (rather than strictly valid XML), you could use this dirty workaround:

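function xhtmlToDomNode($xhtml) {
    $dom = new DOMDocument();
    // loadHTML() uses the lenient HTML parser, which knows entities such as &nbsp;
    $dom->loadHTML('<html><body>'.$xhtml.'</body></html>');
    $fragment = $dom->createDocumentFragment();
    $body = $dom->getElementsByTagName('body')->item(0);
    // childNodes is a live list, so move nodes one by one until the body is empty
    while ($body->firstChild) {
        $fragment->appendChild($body->firstChild);
    }
    return $fragment;
}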

usage:

$fragment = xhtmlToDomNode($xhtml);
// importNode() returns a copy owned by $document; append that copy, not the original
$fragment = $document->importNode($fragment, true);
$elementBody->appendChild($fragment);