XML error at ampersand (&)
I have a PHP file which prints XML based on a MySQL db.
I get an error every time, at exactly the point where there is an & sign.
Here is some of the PHP:
$query = mysql_query($sql);
$_xmlrows = '';
while ($row = mysql_fetch_array($query)) {
$_xmlrows .= xmlrowtemplate($row);
}
function xmlrowtemplate($dbrow){
return "<AD>
<CATEGORY>".$dbrow['category']."</CATEGORY>
</AD>";
}
The output is what I want, i.e. the file outputs the correct category, but still gives an error.
The error says: xmlParseEntityRef: no name
And then it points to the exact character which is a & sign.
It complains only if $dbrow['category'] contains an & sign, for example: "cars & trucks", or "computers & telephones".
Anybody know what the problem is?
BTW: I have the encoding set to UTF-8 in all documents, as well as the xml output.
& in XML starts an entity reference. As you haven't defined an entity named &WhateverIsAfterThat, an error is thrown. You should escape it as &amp;.
$string = str_replace('&', '&amp;', $string);
How do I escape ampersands in XML
To escape the other reserved characters:
function xmlEscape($string) {
return str_replace(array('&', '<', '>', '\'', '"'), array('&amp;', '&lt;', '&gt;', '&apos;', '&quot;'), $string);
}
$string = htmlspecialchars($string, ENT_XML1);
is the most universal way to solve all encoding errors (IMHO better than writing custom functions, and there is no point in solving just the & case).
Credit: I posted Wrikken's and joshweir's comments as an answer to make them more visible.
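As a minimal sketch of how this could be applied to the template function from the question: the value is escaped before it is placed between the tags. Adding ENT_QUOTES and the explicit 'UTF-8' argument is my own assumption, not part of the answer above; it also escapes quotes and pins the character set.

```php
<?php
// Sketch only: the question's row template with the value escaped for XML.
function xmlrowtemplate(array $dbrow): string
{
    // ENT_XML1 selects XML entity names; ENT_QUOTES (an assumption here)
    // additionally escapes single and double quotes.
    $category = htmlspecialchars($dbrow['category'], ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return "<AD>\n<CATEGORY>" . $category . "</CATEGORY>\n</AD>\n";
}

// The ampersand is emitted as &amp; so the parser no longer sees a bare &.
echo xmlrowtemplate(['category' => 'cars & trucks']);
```

The escaping happens at output time, so the data in the database stays unchanged.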
You need to either turn & into its entity &amp;, or wrap the contents in CDATA tags.
If you choose the entity route, there are additional characters you need to turn into entities:
> &gt;
< &lt;
' &apos;
" &quot;
Background: Beware of the ampersand when using XML
Wikipedia: List of XML character entity references
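For the CDATA route, a rough sketch might look like the following. The helper name xmlCdata is hypothetical. The one thing to watch is the sequence ]]> inside the data, which would end the section early, so the sketch splits it across two CDATA sections.

```php
<?php
// Hypothetical helper: wrap arbitrary text in a CDATA section.
function xmlCdata(string $text): string
{
    // "]]>" would terminate the section, so close it after "]]" and
    // reopen a new section before ">".
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $text);

    return '<![CDATA[' . $safe . ']]>';
}

// The ampersand can stay as-is inside CDATA.
echo '<CATEGORY>' . xmlCdata('cars & trucks') . '</CATEGORY>', "\n";
```

With CDATA the text is not escaped at all, so this suits element content only, not attribute values.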
An XML escape function using a regex and a switch:
function XmlEscape(str) {
if (!str || str.constructor !== String) {
return "";
}
return str.replace(/[\"&><]/g, function (match) {
switch (match) {
case "\"":
return "&quot;";
case "&":
return "&amp;";
case "<":
return "&lt;";
case ">":
return "&gt;";
}
});
};
public function sanitize(string $data) {
return str_replace('&', '&amp;', $data);
}
You are right, so here is more context: the example relates to how to deal with data containing '&' when we pass this data to SimpleXML. Of course, another solution is to use
<![CDATA[some stuff]]>
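To make the SimpleXML case concrete, here is a small sketch. It assumes the data is passed through addChild(), which, as far as I know, does not escape a bare & in the value for you. The standalone sanitize() function mirrors the method above, and the element names are taken from the question.

```php
<?php
// Standalone version of the sanitize() method shown above.
function sanitize(string $data): string
{
    return str_replace('&', '&amp;', $data);
}

$ad = new SimpleXMLElement('<AD/>');

// Escape the ampersand before handing the value to addChild().
$ad->addChild('CATEGORY', sanitize('cars & trucks'));

$xml = $ad->asXML();
echo $xml;
```

Reading the value back with (string) $ad->CATEGORY should give the original text with a plain &.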