Entity declaration in PHP-generated XML document (&nbsp;, &mdash;, etc.)

This is driving me nuts. There are lots of similar problems out there on the web, but I can't find the right solution.

I am creating an xml document in php to be sent as the response to an ajax request. The response will look something like this:

<?xml version="1.0" encoding="iso-8859-1"?>
<response>
  <status>success</status>
  <message>&nbsp;&mdash;</message>
</response>

The message tag will contain more meaningful information than that, but it's entities like those that are giving me the problem.

The php code that generates that xml is below:

header("Content-Type: text/xml");

$dom = new DOMDocument('1.0', 'iso-8859-1');
$dom->formatOutput = true;

$response_node = $dom->createElement("response");
$dom->appendChild($response_node);
$response_node->appendChild($dom->createElement('status', 'success'));
$response_node->appendChild($dom->createElement('message', "&nbsp;&mdash;"));
echo $dom->saveXML();
return;

The xml shown above is successfully returned to the javascript function that made the call, but when it tries to parse the xml document, it fails.

If I try to validate the xml using this validator I get the following error:

This page contains the following errors:

error on line 5 at column 15: Entity 'nbsp' not defined

The entity &mdash; causes the same problem.

I think I may need to find a way to put something like this in the xml:

<!ENTITY name "entity_value">

I'm not sure how to do this though, or if it's the right way to go about it. Am I on the right track? If so, how do I do it? If not, what is the right way to go about solving this problem?


HTML entity names are not valid in XML unless you define them with <!ENTITY name "...">, as you pointed out. Numeric character references, however, work without any declaration.

Try replacing:

&nbsp; => &#xA0;

&mdash; => &#x2014;
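
As a rough illustration of that replacement, here is a minimal sketch. The helper name named_to_numeric and its two-entry map are assumptions made for this example, not part of any library, and the map would need extending for other entities.

```php
<?php
// Hypothetical helper (not a built-in): swaps the two named HTML entities
// from the question for numeric character references.
function named_to_numeric(string $text): string
{
    // Assumed map; add further entities here as needed.
    $map = [
        '&nbsp;'  => '&#xA0;',
        '&mdash;' => '&#x2014;',
    ];
    return strtr($text, $map);
}

$message = named_to_numeric('&nbsp;&mdash;');

// Numeric references need no DTD, so a plain XML parser accepts the result.
$dom = new DOMDocument();
$ok  = $dom->loadXML('<message>' . $message . '</message>');
```

If $ok is true, the fragment is well-formed and the same string should be safe to place in the response.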


One way to solve the problem is to add a doctype declaration that defines the entities:

$dom = new DOMDocument('1.0', 'iso-8859-1');
$dom->formatOutput = true;
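// createDocumentType() is an instance method; calling it statically fails on PHP 8+.
$implementation = new DOMImplementation();
$doctype = $implementation->createDocumentType("html", "-//W3C//DTD XHTML 1.1//EN", "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd");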
$dom->appendChild($doctype);

$response_node = $dom->createElement("response");
$dom->appendChild($response_node);
$response_node->appendChild($dom->createElement('status', 'success'));
$response_node->appendChild($dom->createElement('message', "&nbsp;&mdash;"));
echo $dom->saveXML();
return;
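
The question also asks how to use <!ENTITY name "entity_value"> directly. A possible sketch, assuming only the two entities from the question are needed, is to declare them in an internal DTD subset so the document does not depend on fetching an external DTD. The XML string below is written by hand for illustration; it is not output produced by the code above.

```php
<?php
// Hand-written response with an internal DTD subset that declares
// just the two entities used in the message (an assumption of this sketch).
$xml = <<<'XML'
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE response [
  <!ENTITY nbsp "&#160;">
  <!ENTITY mdash "&#8212;">
]>
<response>
  <status>success</status>
  <message>&nbsp;&mdash;</message>
</response>
XML;

// LIBXML_NOENT substitutes entities while parsing. It is used here only
// because the string is built locally; avoid it for untrusted input.
$dom    = new DOMDocument();
$loaded = $dom->loadXML($xml, LIBXML_NOENT);
$text   = $dom->getElementsByTagName('message')->item(0)->textContent;
```

Whether the receiving parser honours the internal subset depends on that parser, so this should be checked on the JavaScript side as well.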


The em dash (—, U+2014) and the non-breaking space (U+00A0) are perfectly valid UTF-8 characters and are allowed in XML.
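
Because the characters themselves are allowed, one option is to pass them to the DOM as text instead of as entity names. The following is a minimal sketch under that assumption; build_response is a hypothetical helper name, and the input is assumed to be UTF-8, which is what the DOM extension expects.

```php
<?php
// Hypothetical helper: builds the response from the question, but adds the
// values as text nodes so no named entity ever reaches the document.
function build_response(string $status, string $message): string
{
    $dom = new DOMDocument('1.0', 'iso-8859-1');
    $dom->formatOutput = true;

    $response = $dom->createElement('response');
    $dom->appendChild($response);

    foreach (['status' => $status, 'message' => $message] as $name => $value) {
        $node = $dom->createElement($name);
        // createTextNode() escapes markup characters such as & and <.
        $node->appendChild($dom->createTextNode($value));
        $response->appendChild($node);
    }

    // Characters that ISO-8859-1 cannot represent (such as the em dash)
    // should be written by the serializer as numeric character references.
    return $dom->saveXML();
}

// "\u{A0}" is the non-breaking space, "\u{2014}" the em dash (UTF-8 input).
$xml = build_response('success', "\u{A0}\u{2014}");
```

The returned string can then be echoed after the Content-Type header, as in the original code.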

If your original message contains these characters and they were converted to entities before being placed in your XML, specify that you want to convert characters for XML, not for HTML:

PHP 5.4.0+:

$encoded_value = htmlentities($value, ENT_COMPAT | ENT_XML1);

The default encoding has changed between PHP versions (it was ISO-8859-1 before PHP 5.4), so it is safer to specify UTF-8 explicitly:

$encoded_value = htmlentities($value, ENT_COMPAT | ENT_XML1, 'UTF-8');

Note: you can use the html_entity_decode function to get the — character from the &mdash; entity.
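
Putting the two functions together, a sketch of the conversion might look like this. The helper name html_text_to_xml_text is an assumption for this example, and the input is assumed to be HTML-escaped text encoded as UTF-8.

```php
<?php
// Hypothetical helper: converts HTML-escaped text into XML-safe text.
function html_text_to_xml_text(string $html): string
{
    // Step 1: turn named HTML entities (&nbsp;, &mdash;, ...) into the
    // actual UTF-8 characters.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Step 2: escape again for XML. With ENT_XML1 only the entities that
    // XML predefines (such as &amp; and &lt;) are produced.
    return htmlentities($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

$safe = html_text_to_xml_text('&nbsp;&mdash; Tom &amp; Jerry');
```

The result keeps the non-breaking space and the em dash as literal characters and leaves &amp; escaped, so it can be embedded in an XML document that declares a suitable encoding.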
