PHP DomDocument, DomXPath encoding issue
I'm having a problem with encoding from a wordpress feed that I just can't seem to figure out.
I was loading my feed with DOMDocument->load, but then switched to file_get_contents followed by ->loadXML, with the same results. I used loadXML so I could manipulate the feed if needed.
The correct output that I'm looking for is: ‘ £
If I just echo from an XPath query, I get: ‘ £
If I echo with utf8_decode, I get: ? £
A lot better but the question mark should be an apostrophe.
If I loop through each node of the DomDocument when it is loaded, I get the correct output. So it seems that it's being handled incorrectly in XPath.
Any thoughts?
The feed is http://shredeasy.com/blog/category/news/feed
Here is the function that is being called:
function getPostsInCategory($feed=NULL){
    if(is_null($feed)){ echo "Wrong Usage. Need a valid Category Feed. Most likely from getCategories()."; return false; }
    $feedx = file_get_contents($feed);
    $xml = new DOMDocument();
    $xml->loadXML($feedx);
    //$this->showDOMNode($xml);
    //$xml->load($feed);
    $xpath = new DomXPath($xml);
    $xpath->registerNamespace("content", "http://web.resource.org/rss/1.0/modules/content/");
    $cat = array();
    foreach($xml->getElementsByTagName('item') as $c){
        $elements = array();
        $elements["title"] = $xpath->query("title", $c)->item(0)->nodeValue;
        echo utf8_decode($elements["title"]);
I have been trying to figure this out for hours and I keep circling back to the wrong thing.
Thanks for the help!
You're right, it seems that apostrophes are turning into question marks. I don't know if that's the only issue or not.
The string being echoed is encoded in UTF-8.
- If your page is encoded in UTF-8, you can just echo it, possibly calling htmlspecialchars with the third argument set to "UTF-8".
- Otherwise, you have to convert it first to whatever encoding your web page is using. See iconv and mb_convert_encoding.
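To make the two cases concrete, here is a minimal sketch. The sample XML string, the variable names, and the choice of Windows-1252 as the "other" page encoding are my assumptions for illustration, not something taken from your feed or site. It relies on the fact that DOM always hands back node values as UTF-8, and it also suggests why utf8_decode gives you a question mark: that function targets ISO-8859-1, which has a slot for £ but none for ‘, whereas Windows-1252 has both.

```php
<?php
// Hypothetical stand-in for the feed: one item whose title is "‘ £".
$feedXml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<rss><channel><item><title>' . "\u{2018} \u{00A3}" . '</title></item></channel></rss>';

$doc = new DOMDocument();
$doc->loadXML($feedXml);
$xpath = new DOMXPath($doc);

// nodeValue is returned as a UTF-8 string, whatever the encoding of the source document.
$title = $xpath->query('//item/title')->item(0)->nodeValue;

// Case 1: the page is served as UTF-8
// (for example after header('Content-Type: text/html; charset=UTF-8')).
// Escape for HTML and output the string unchanged.
$forUtf8Page = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Case 2: the page is served in another encoding. Windows-1252 is an assumed
// example here; substitute whatever charset your page actually declares.
$pageEncoding = 'Windows-1252';
$converted = iconv('UTF-8', $pageEncoding, $title);
$forLegacyPage = htmlspecialchars($converted, ENT_QUOTES, $pageEncoding);

echo $forUtf8Page, "\n";
```

If your page really is ISO-8859-1, no conversion can keep the curly quote as a raw byte; you would have to serve the page as UTF-8 or replace the character with something that encoding can represent.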