Parsing table cell data with a space wherever there is a td tag
I am parsing HTML table data with the help of the Html Agility Pack. First I find the rows in the table like this:
var rows = table.Descendants("tr");
Then I get the cell data for each row like this:
foreach (var row in rows)
{
    string rowInnerText = row.InnerText;
}
That gives me the cell data, but with no spaces between the cells, like NameAddressPhone No. I want the inner text to be Name Address Phone No instead; that is, wherever there is a td tag I want one space between the cells of different columns.
Here is an idea, however completely untested:
var rows = table.Descendants("tr").Select(tr =>
string.Join(" ", tr.Descendants("td").Select(td => td.InnerText).ToArray()));
This should give you an IEnumerable<string> where each element represents one row from the table, in the format described in your question. If you actually need the loop over the rows for other processing, keep your foreach loop and use the LINQ expression in its body:
var rows = table.Descendants("tr");
foreach (var row in rows)
{
    string rowInnerText = string.Join(" ",
        row.Descendants("td").Select(td => td.InnerText).ToArray());
}
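If the table also has header cells or padded cell text, a slightly extended variant of the same idea might look like the sketch below. It is equally untested against your real markup. The sample HTML string is made up for illustration, and including th cells, trimming whitespace, and decoding entities with HtmlEntity.DeEntitize are assumptions about what you want, not something stated in the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup, used only for illustration.
        var html = "<table>" +
                   "<tr><th>Name</th><th>Address</th><th>Phone No</th></tr>" +
                   "<tr><td> Alice </td><td>1 Main St</td><td>555&nbsp;0100</td></tr>" +
                   "</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (var row in doc.DocumentNode.Descendants("tr"))
        {
            // Take only the direct child cells of this row (th and td),
            // so cells of a nested table are not joined into the outer row.
            var cells = row.ChildNodes
                .Where(n => n.Name == "td" || n.Name == "th")
                // Decode entities such as &nbsp; and trim surrounding whitespace.
                .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim());

            // One space between the cells of different columns.
            string rowInnerText = string.Join(" ", cells.ToArray());
            Console.WriteLine(rowInnerText);
        }
    }
}
```

Using the row's direct children instead of Descendants("td") only matters when tables are nested; for a flat table both approaches should return the same cells.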