Truncate a string with an ellipsis, making sure not to break any HTML entity

I have a database of items with XHTML content. I want to display the items with the HTML stripped off (done) and then truncate each item to a maximum length of 100 characters. If the string exceeds 100 characters, I cut it off and insert … (an ellipsis) at the end.

The problem is that my program doesn't understand HTML entities that are already in the string. E.g. if the string is something &amp; something, my function may truncate it as something &am... resulting in invalid XHTML.

What is the best way to go about this problem in ASP.NET/C#?


You could use HtmlDecode to convert the HTML entities to plain characters, truncate the decoded string, and finally encode the result:

// Truncate is your own helper that cuts the text to the desired length.
var decoded = HttpUtility.HtmlDecode(theEncodedString);
decoded = Truncate(decoded);
var result = HttpUtility.HtmlEncode(decoded);
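
A fuller sketch of the decode, truncate, encode sequence might look like the following. The class name HtmlTruncator, the method TruncateEncoded, and the body of Truncate are assumptions made for illustration (the answer does not define Truncate), and appending &hellip; only when something was actually cut is one possible reading of the question.

```csharp
using System;
using System.Web;

public static class HtmlTruncator
{
    // Hypothetical helper: cuts text to at most maxLength UTF-16 code units.
    public static string Truncate(string text, int maxLength)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));
        if (text.Length <= maxLength) return text;

        int cut = maxLength;
        // Avoid splitting a surrogate pair at the cut point.
        if (cut > 0 && char.IsHighSurrogate(text[cut - 1])) cut--;
        return text.Substring(0, cut);
    }

    // Decodes entities, truncates the plain text, then re-encodes it.
    public static string TruncateEncoded(string encoded, int maxLength)
    {
        string decoded = HttpUtility.HtmlDecode(encoded);
        bool wasCut = decoded.Length > maxLength;
        string result = HttpUtility.HtmlEncode(Truncate(decoded, maxLength));
        // The ellipsis entity is appended after encoding so it is not escaped.
        return wasCut ? result + "&hellip;" : result;
    }
}
```

Because the cut happens on the decoded text, an entity such as &amp; counts as one character and can never be split; the re-encoding step then restores valid markup.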


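You could use a regular expression that matches either an HTML entity or a single character, repeated up to the length you want. Something like:

^(&#?\w+;|.){0,100}

The lower bound is written explicitly because .NET does not treat {,100} as a quantifier, and the optional # lets numeric entities such as &#8230; count as one unit.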
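
As a rough illustration of the regular expression approach in C#, the sketch below works directly on the encoded string. The names EntityAwareTruncator and TruncateKeepingEntities are hypothetical, and the pattern is an adapted form of the one above, assumed for this sketch: an explicit lower bound on the quantifier, an optional # so numeric entities count as one unit, and RegexOptions.Singleline so that . also matches newlines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EntityAwareTruncator
{
    // Keeps at most maxLength units, where a unit is one entity or one character.
    public static string TruncateKeepingEntities(string encoded, int maxLength)
    {
        if (encoded == null) throw new ArgumentNullException(nameof(encoded));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        // The entity alternative is tried first, so "&amp;" is consumed whole.
        string pattern = @"^(?:&#?\w+;|.){0," + maxLength + "}";
        Match prefix = Regex.Match(encoded, pattern, RegexOptions.Singleline);

        // Append the ellipsis entity only if part of the input was dropped.
        return prefix.Length < encoded.Length
            ? prefix.Value + "&hellip;"
            : prefix.Value;
    }
}
```

This variant skips the decode and encode round trip, at the cost of treating anything shaped like &name; as a single unit, whether or not it is a real entity.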