C#: rendering HTML into text

I want to be able to take HTML code and render plain text out of it.

In other words, this would be my input:

<h3>some text</h3>

I want the result to look like this:

some text

How would I do it?


I would suggest trying the HTML Agility Pack for .NET:

Html Agility Pack - Codeplex

Attempting to parse HTML with anything else is, for the most part, unreliable.

Whatever you do, DON'T TRY TO PARSE HTML WITH REGEX!
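
For illustration, here is a minimal sketch of how that could look with the HTML Agility Pack. It assumes the HtmlAgilityPack NuGet package is referenced; the class name HtmlToText and the method name Convert are made up for this example and are not part of the library.

```csharp
using HtmlAgilityPack;

// Hypothetical helper for illustration; only HtmlDocument, LoadHtml,
// DocumentNode, InnerText and HtmlEntity.DeEntitize come from the library.
public static class HtmlToText
{
    public static string Convert(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // InnerText concatenates the text nodes under the root node.
        string text = doc.DocumentNode.InnerText;

        // DeEntitize turns entities such as &amp; back into characters.
        return HtmlEntity.DeEntitize(text).Trim();
    }
}
```

Calling HtmlToText.Convert("<h3>some text</h3>") should give back "some text". Note that InnerText does not insert line breaks between block elements, so a larger page would probably need extra handling for that.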


Use regex.

String result = Regex.Replace(your_text_goes_here, @"<[^>]*>", String.Empty);
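
A slightly fuller sketch of the same idea, wrapped in a method and followed by entity decoding, might look like this. The class name TagStripper and the method name StripTags are hypothetical; Regex.Replace and WebUtility.HtmlDecode are standard .NET calls.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class TagStripper
{
    public static string StripTags(string html)
    {
        // Remove anything that looks like a tag: '<', then any run of non-'>' characters, then '>'.
        string withoutTags = Regex.Replace(html, @"<[^>]*>", string.Empty);

        // Decode entities such as &amp; that are left in the remaining text.
        return WebUtility.HtmlDecode(withoutTags).Trim();
    }
}
```

For the input in the question, TagStripper.StripTags("<h3>some text</h3>") returns "some text". The pattern is only reliable for simple markup: the contents of script and style elements are left in the output, and a '>' inside an attribute value would cut a tag short.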


You would need to use some form of HTML parser. You could use an existing regex or write your own, but regexes aren't 100% reliable on HTML. I would suggest using a third-party library like HtmlAgilityPack (I have used this one and would recommend it).


Poor Man's HTML Parser

        string s =
            @"
            <html>
            <body>
            <h1>My First Heading</h1>
            <p>My first paragraph.</p>
            </body>
            </html> 
        ";

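        // Split on '<' so each piece holds a tag name, then '>', then any text that followed the tag.
        foreach (var item in s.Split('<'))
        {
            int x = item.IndexOf('>');

            if (x != -1)
            {
                // Keep only what follows the closing '>' and skip whitespace-only pieces.
                string text = item.Substring(x + 1).Trim();
                if (text.Length > 0)
                {
                    Console.WriteLine(text);
                }
            }
        }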