开发者

html table to CSV, formatting problem in csv

I am stuck with idea on creating proper CSV from an html table. I am using HTMLAgilityPack to read the html from string and create a HTMLDocument. Then I am using XPATH to loop through rows and columns.

The problem is that I am unable to determine the correct row and cell(x,y) for a particular cell.

Example HTML:

<html>
<body>
    <table border="1">
        <tr>
            <td rowspan="2">
                100
            </td>
            <td>
                200
            </td>
            <td colspan="2">
                300
            </td>
        </tr>
        <tr>
            <td colspan="2">
                400
            </td>
            <td>
                600
            </td>
        </tr>
        <tr>
            <td>
                400
            </td>
            <td>
                500
            </td>
            <td>
                600
            </td>
        </tr>
    </table>
</body>
</html>

Image of Table

When I open it in excel and save as CSV, I do get t开发者_JS百科he desired output, which is:

100,200,300,
,400,,600
400,500,600,

Can someone help me create the same output in .Net respecting the rowpan and colspan?

Thanks! Dex


You don't need to know which row and column are you on. All you need to do is add a "," for each new column you found and a breakline every time you reach the end of a row.

If you navigate through the document considering it an xml document all you have to do is go through all TR nodes adding a breakline when you reach the end of the child nodes list. And iterate through all TD nodes on each TR node adding a "," when necessary.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜