Problem querying web page parsed with HTML Agility Pack

I have the following HTML source snippet:

<div class = "discount_tools_row">
  <div class = "discount_tools">
    <ul> 
      <li><a href = "#" class = "share-discount" rel = "nofollow"></a></li>
      <li><a href = "/deal/map/4243683"
             class = "show-location"
             title = "הראה מקום על מפה"
             data-address = "רח&#39; האצ&quot;ל 39, ראשון לציון"></a></li>
    </ul>

    <link rel = "prerender"
          href = "http://www.bigdeal.co.il/?CampaignId=873&sId=10">
    <a class = "tavo_button"
       data-provider = "bigdeal"
       href = "http://www.bigdeal.co.il/?CampaignId=873&sId=10"
       target="_blank"
       rel = "nofollow">תבוא!</a>
    </div>
  </div>
</div>

Using the HTML Agility Pack I want to fetch pairs of <data-address value, link rel="prerender" href value>.

I tried the following but got wrong results:

var nodes = doc.DocumentNode.SelectNodes(
    "//div[@class=\"discount_tools\"]");
var geoNodes = nodes.Where(node => !string.IsNullOrEmpty(
    node.ChildAttributes("data-address").ToString()));
AnswerFormat ans = new AnswerFormat {
    Locations = geoNodes.Select(
        node => node.ChildAttributes("data-address").ToString()).ToList(),
    //Names = nodes.Select(node => node.Attributes["data-address"].Value).
    //ToList(),
    Details = geoNodes.Select(
        node => node.ChildAttributes("data-direct-url").ToString()).ToList()
};

I was trying to select every <div class="discount_tools"> that has a data-address attribute in one of its child nodes and an <a class="tavo_button" data-provider="bigdeal" href=...> in another child node.

How can I improve my query?


That was my solution:

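        // Select every discount_tools block, then collect the <a> descendants of each one.
        var nodes = doc.DocumentNode.SelectNodes("//div[@class=\"discount_tools\"]");
        var linksCollections = nodes.Select(node => node.Descendants("a"));

        List<string> Locations = new List<string>();
        List<string> Categories = new List<string>();
        List<string> Hrefs = new List<string>();

        // GetAtt is a helper that is not shown in this post. Presumably it returns the
        // value of the named attribute from the links in the collection, and the
        // three-argument form returns "href" from the link that carries "data-provider".
        foreach (var col in linksCollections)
        {
            string location, category, href;
            location = GetAtt("data-address", col);
            if (!string.IsNullOrEmpty(location))
            {
                // "data-kind" does not appear in the snippet above; it comes from the full page.
                category = GetAtt("data-kind", col);
                if (!string.IsNullOrEmpty(category))
                {
                    href = GetAtt("data-provider", "href", col);
                    if (!string.IsNullOrEmpty(href))
                    {
                        // Keep a block only when all three values are present.
                        Locations.Add(location);
                        Categories.Add(category);
                        Hrefs.Add(href);
                    }
                }
            }
        }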


String dataAddressValue = doc.DocumentNode.SelectSingleNode("//div[@class='discount_tools']/ul/li/a[@class='show-location']").Attributes["data-address"].Value;

String LinkHrefValue = doc.DocumentNode.SelectSingleNode("//div[@class='discount_tools']/link[@rel='prerender']").Attributes["href"].Value;
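
Note that SelectSingleNode with a "//" path returns only the first match in the whole document, so the two lines above give one address and one link even if the page has several discount_tools blocks. To keep each address paired with the link from the same block, the queries can be run relative to each block instead. This is only a minimal sketch, assuming HtmlAgilityPack is referenced in the project; the DealExtractor class and ExtractPairs method are names made up for illustration and are not part of the library or the original post:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class DealExtractor
{
    // Hypothetical helper: returns one (address, href) pair for each
    // discount_tools block that contains both values.
    public static List<(string Address, string Href)> ExtractPairs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<(string Address, string Href)>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var blocks = doc.DocumentNode.SelectNodes("//div[@class='discount_tools']");
        if (blocks == null)
        {
            return pairs;
        }

        foreach (var block in blocks)
        {
            // The leading "." keeps each query relative to the current block.
            var addressNode = block.SelectSingleNode(".//a[@data-address]");
            var linkNode = block.SelectSingleNode(".//link[@rel='prerender']");
            if (addressNode == null || linkNode == null)
            {
                continue; // skip blocks that are missing either value
            }

            // Attribute values may still contain entities such as &#39; or &quot;,
            // so decode them before storing.
            string address = HtmlEntity.DeEntitize(
                addressNode.GetAttributeValue("data-address", ""));
            string href = HtmlEntity.DeEntitize(
                linkNode.GetAttributeValue("href", ""));

            pairs.Add((address, href));
        }

        return pairs;
    }
}
```

Calling DealExtractor.ExtractPairs(pageHtml) on the page source would then give one tuple per block, with blocks that lack either the data-address link or the prerender link left out.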