How to parse a string containing HTML tags that mark substrings as bold, italic, or underlined

I created a text rendering tool for a 2D graphics framework in C#.

Now I am trying to parse text that contains specific HTML tags, like:

"Hello <b>world</b>!" 

But the parsing code was getting ugly, and I thought there must be some library that does exactly that. In the end it should output an array of data structures like:

string text;
bool IsBold;
bool IsItalic;
bool IsUnderlined;
...

or

string text;
FontStyle FontStyle;

Does anyone know of such a parser?

Thanks a lot!


The HTML Agility Pack is a good HTML parser (and also parses fragments).

You can query it using XPath syntax (its API is similar to XmlDocument), though I'm not sure how well it will fit your requirements.
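As a rough sketch of how the parsed tree could be turned into the styled runs the question asks for: the code below assumes the HtmlAgilityPack NuGet package is referenced, and that only b/strong, i/em and u tags matter. The names TextRun and StyledTextParser are hypothetical, modeled on the fields listed in the question. It is not the only way to do this, just one recursive walk that carries the active style flags down to each text node.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical output type, mirroring the fields from the question.
public sealed class TextRun
{
    public string Text = "";
    public bool IsBold;
    public bool IsItalic;
    public bool IsUnderlined;
}

public static class StyledTextParser
{
    // Parses an HTML fragment and returns one TextRun per text node,
    // flagged with the styles of the tags that enclose it.
    public static List<TextRun> Parse(string markup)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(markup);

        var runs = new List<TextRun>();
        Walk(doc.DocumentNode, false, false, false, runs);
        return runs;
    }

    private static bool Is(HtmlNode node, string tagName)
    {
        return string.Equals(node.Name, tagName, StringComparison.OrdinalIgnoreCase);
    }

    // Depth-first walk; style flags accumulate as we descend into nested tags.
    private static void Walk(HtmlNode node, bool bold, bool italic, bool underlined,
                             List<TextRun> runs)
    {
        foreach (HtmlNode child in node.ChildNodes)
        {
            if (child.NodeType == HtmlNodeType.Text)
            {
                // Decode entities such as &amp; into plain characters.
                string text = HtmlEntity.DeEntitize(child.InnerText);
                if (text.Length > 0)
                {
                    runs.Add(new TextRun
                    {
                        Text = text,
                        IsBold = bold,
                        IsItalic = italic,
                        IsUnderlined = underlined
                    });
                }
            }
            else if (child.NodeType == HtmlNodeType.Element)
            {
                Walk(child,
                     bold || Is(child, "b") || Is(child, "strong"),
                     italic || Is(child, "i") || Is(child, "em"),
                     underlined || Is(child, "u"),
                     runs);
            }
            // Comments and other node types are skipped.
        }
    }
}
```

Calling StyledTextParser.Parse("Hello <b>world</b>!") and looping over the result would give you the text pieces along with their flags; mapping the three booleans onto a single FontStyle value instead is a small change in the place where the TextRun is created.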


Tidy.net is a fantastic tool. It is a port of the original Tidy project, which is also used in the HTML Tidy Firefox plugin. Run your markup through Tidy and it will return clean, compliant HTML.


I do not know how this would work, but here are some HTML parsers:
html_parse
htmlagilitypack
