Parsing HTML with XPath/XMLHttpRequest

I'm trying to download an HTML page with XMLHttpRequest and parse it with XPath (on the most recent Safari browser). Unfortunately, I can't get it to work!

var url = "http://google.com";

xmlhttp = new XMLHttpRequest();
xmlhttp.open("GET", url);

xmlhttp.onreadystatechange  = function(){
    if(xmlhttp.readyState==4){
        response = xmlhttp.responseText;
        var doc = new DOMParser().parseFromString(response, "text/xml");
        console.log(doc);
        var nodes = document.evaluate("//a/text()",doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,null);
        console.log(nodes);
        console.log(nodes.snapshotLength);
        for(var i =0; i<nodes.snapshotLength; i++){
            thisElement = nodes.snapshotItem(i);
            console.log(thisElement.nodeName);
        }
    }
};
xmlhttp.send(null);

The text gets downloaded successfully (response contains the valid HTML), and is parsed into a tree correctly (doc represents a valid DOM for the page). However, nodes.snapshotLength is 0, despite the fact that the query is valid and should have results. Any ideas on what's going wrong?


If either of the following applies:

  • you are using a JS library, or
  • you have a modern browser with the querySelectorAll method available (Safari is one),

you can try using CSS selectors instead of XPath to query the DOM.
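As a rough sketch of the CSS-selector approach, the snippet below collects the text of every link in an already parsed document. linkTexts is a hypothetical helper name, and the browser-only part assumes a DOMParser that accepts "text/html", which older browsers may not support.

```javascript
// Hypothetical helper: returns the text of every <a> element in a document.
// Works on any object that implements querySelectorAll (Document, Element).
function linkTexts(doc) {
  var anchors = doc.querySelectorAll("a");
  var texts = [];
  for (var i = 0; i < anchors.length; i++) {
    texts.push(anchors[i].textContent);
  }
  return texts;
}

// Browser-only usage (skipped where DOMParser is not defined).
// Assumption: this DOMParser supports the "text/html" type.
if (typeof DOMParser !== "undefined") {
  var sampleHtml = "<p><a href='/a'>First</a> and <a href='/b'>Second</a></p>";
  var htmlDoc = new DOMParser().parseFromString(sampleHtml, "text/html");
  console.log(linkTexts(htmlDoc));
}
```

In the question's code, the same helper could be called on the parsed response in place of the document.evaluate call.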


HTML is not XML, and the two are not interchangeable. Unless the "HTML" is actually XHTML, you will not be able to parse it as XML and process the result with XPath.
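One way to see this in practice is to check whether the XML parse actually succeeded. The sketch below assumes the common browser behaviour of reporting an XML parse failure by placing a parsererror element in the returned document; hasParseError is a hypothetical helper name.

```javascript
// Hypothetical helper: true if an XML parse produced an error document.
// Assumption: the browser signals failure with a <parsererror> element.
function hasParseError(doc) {
  return doc.getElementsByTagName("parsererror").length > 0;
}

// Browser-only usage (skipped where DOMParser is not defined).
if (typeof DOMParser !== "undefined") {
  var parser = new DOMParser();
  // Typical HTML: <br> is never closed, so this is not well-formed XML.
  var asXml = parser.parseFromString("<p>line one<br>line two</p>", "text/xml");
  console.log(hasParseError(asXml));
}
```

If the check reports an error for the downloaded page, the XPath query is likely running against an error document or a partial tree rather than the page's full markup.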
