
Double Xpath with PHP not working

I'm using PHP and have successfully loaded an HTML document from a URL. My first XPath query works, but a second query on the same DOMDocument always fails: no errors, just no results. Is it my code, or am I missing something else? (As a test, I'm trying to scrape information from an Apple App Store page, specifically the description of a given application.) Here is the second query:

//retrieving description
$path2 = "//div[@class='product-review'][1]/p[@class='truncate']";
$result_row = $xpath->query($path2);
print_r($result_row);
foreach($result_row as $rows){
  echo "haben was";
  print_r($rows);
  $desc = $rows->childNodes->item(0)->textContent; // textContent is a property, not a method
}


You can get pretty much everything but the customer reviews from the AppStore by using the public API:

$appStore = json_decode(
    file_get_contents(
        'http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/wa/wsLookup?id=387851294'
    )
);
echo $appStore->results[0]->description;

Example of full Json Result
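
The lookup call above can fail or return an empty result set, so it may be worth guarding the access to results[0]. The following is only a sketch: the helper name extractDescription and the sample JSON string are hypothetical, and the sample is a heavily trimmed stand-in for a real response, keeping only the results[0]->description shape used above.

```php
<?php
// Hypothetical helper: return the description from a lookup response body,
// or null if the body is not valid JSON or has no usable first result.
function extractDescription(string $json): ?string
{
    $data = json_decode($json);
    if (!is_object($data) || empty($data->results) || !isset($data->results[0]->description)) {
        return null;
    }
    return $data->results[0]->description;
}

// Made-up, trimmed response body for illustration only.
$sample = '{"results":[{"description":"Sample description text."}]}';
echo extractDescription($sample), "\n";
```

In real use the string passed in would be whatever file_get_contents() returned for the lookup URL, after checking that the call did not return false.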


This seems to be a namespace issue. Your example HTML source begins with

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.apple.com/itms/" lang="de">

The xmlns attribute means the document has a default namespace, so your XPath queries must refer to that namespace in order to find any elements. (Oddly, the doctype claims this is an XHTML document, yet the root element is not in the XHTML namespace.)

You need to register the default namespace used by <html>. Because <html> is in the default namespace, its elements carry no prefix, but for your XPath to work you must bind this namespace to some prefix and then use that prefix in your XPath expression.

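// $xpath is the DOMXPath object from the question; with SimpleXML the equivalent call is registerXPathNamespace()
$xpath->registerNamespace("ns", "http://www.apple.com/itms/");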
$path2 = "//ns:div[@class='product-review'][1]/ns:p[@class='truncate']";

In XPath 1.0, a name test without a namespace prefix only ever matches nodes that are in no namespace.
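
To see the effect in isolation, here is a minimal sketch using DOMDocument and DOMXPath. The inline XML is a made-up stand-in for the downloaded page (only the namespace URI and class names are taken from the question), and it assumes the page is parsed as XML, so that the default namespace is preserved.

```php
<?php
// Hypothetical stand-in for the downloaded page, using the same default namespace.
$xml = <<<XML
<html xmlns="http://www.apple.com/itms/" lang="de">
  <body>
    <div class="product-review">
      <p class="truncate">Sample description text.</p>
    </div>
  </body>
</html>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);

// Unprefixed names are looked up in no namespace, so this finds nothing.
$unprefixed = $xpath->query("//div[@class='product-review'][1]/p[@class='truncate']");

// Bind the document's default namespace to an arbitrary prefix ("ns" here).
$xpath->registerNamespace('ns', 'http://www.apple.com/itms/');
$prefixed = $xpath->query("//ns:div[@class='product-review'][1]/ns:p[@class='truncate']");

// Collect the text of every matching <p> element.
$descriptions = [];
foreach ($prefixed as $node) {
    $descriptions[] = trim($node->textContent);
}

echo $unprefixed->length, "\n";
echo implode("\n", $descriptions), "\n";
```

The prefix itself is arbitrary; only the namespace URI has to match the one declared on <html>.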
