
HTML string reader

I need to load HTML and parse it. I think it should be something simple: I pass in a string containing HTML, and it reads the string into a DOM-like object so I can search and parse the content of the HTML, which would make scraping and similar tasks easier.

Do you know of anything like that?

Thanks


HTML Agility Pack

It has an API similar to XmlDocument. For example (from the examples page):

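 // Requires the HtmlAgilityPack package (using HtmlAgilityPack;)
 HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm");
 // Select every <a> element that has an href attribute
 foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
 {
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att); // FixLink is your own helper
 }
 doc.Save("file.htm");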

(To parse a string of HTML rather than a file on disk, you should be able to call LoadHtml instead of Load.)
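
Since the question is about starting from a string, here is a rough sketch of the LoadHtml route. The class and method names (`LinkExtractor`, `ExtractLinks`) are made up for illustration and are not part of HTML Agility Pack; it also assumes a reasonably recent version of the library, where the root is exposed as `DocumentNode`.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a href="..."> in an HTML string.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string rather than a file path

        var hrefs = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
        {
            return hrefs;
        }

        foreach (HtmlNode link in links)
        {
            // The second argument is the default used if the attribute is missing.
            hrefs.Add(link.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }
}
```

You would call it as `LinkExtractor.ExtractLinks(someHtmlString)`. The null check matters in practice, because a page with no matching nodes would otherwise throw in the foreach.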


If you're running in-browser, you should be able to use the Html DOM Bridge, load the HTML into it, and walk the DOM Tree for that.
