Built-in Regex class or a parser: how to extract text between the tags of an HTML file?
I have an HTML file that contains a table and other information, which I need to process in my C# .NET application.
I want to parse the table contents for only some of the columns. Should I use an HTML parser or the Replace method of the .NET Regex class?
If I use a parser, how do I use it? Will the parser extract the information that is between the tags? If so, how? Please show an example if possible, because I am new to parsers.
If I use the Replace method of the Regex class, how do I pass it the name of the file from which I want to extract the information?
Edit: I want to extract information from the table in the HTML file. How can I use the HTML Agility Pack for that? What kind of code should I write to use that parser?
You just asked an almost identical question and deleted it. Here was the answer I gave before:
Try the HTML Agility Pack.
Here's an example:
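// Requires the HtmlAgilityPack package and "using HtmlAgilityPack;"
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        att.Value = FixLink(att); // FixLink is your own helper that returns the corrected URL
    }
}
doc.Save("file.htm");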
As for your additional question about regex: do not use Regex to parse HTML. It is not a robust solution, because real-world HTML is nested and often malformed. The library above does a much better job.
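Since the question is about table columns rather than links, here is a minimal sketch of the same approach applied to a table. It assumes the HtmlAgilityPack NuGet package is referenced; the class name TableExtractor and the method ExtractColumns are made up for illustration and are not part of the library. It also assumes you want the first table in the document and that data cells are td elements (rows containing only th cells are skipped).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper, not part of HtmlAgilityPack.
public static class TableExtractor
{
    // For each data row of the first <table>, returns the text of the
    // requested zero-based columns. Missing cells become empty strings.
    public static List<string[]> ExtractColumns(string html, params int[] columns)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // use doc.Load("file.htm") to read from a file instead

        var result = new List<string[]>();

        // (//table)[1] selects the first table in document order.
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("(//table)[1]//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("td");
            if (cells == null) continue; // header row made of <th> cells only

            result.Add(columns
                .Select(i => i < cells.Count
                    ? HtmlEntity.DeEntitize(cells[i].InnerText).Trim()
                    : string.Empty)
                .ToArray());
        }
        return result;
    }
}
```

For example, TableExtractor.ExtractColumns(File.ReadAllText("file.htm"), 0, 2) would give you the first and third columns of every data row. Note that rows of any table nested inside the first one are picked up as well, so tighten the XPath if your file has nested tables.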
HtmlAgilityPack....
Next time, search for an existing answer before asking. This is almost certainly a duplicate.
Little tutorial.