How is the DOM parsed? [duplicate]

This question already has answers here: Closed 12 years ago.

Possible Duplicate:

If you're not supposed to use Regular Expressions to parse HTML, then how are HTML parsers written?

My question is simple: How do current DOM parsers actually parse the DOM from a string (XML, HTML, or otherwise)?

I know you shouldn't parse HTML with regex, but couldn't a DOM parser use regex to match patterns for open/close tags? Or is there a good single-pass algorithm for parsing the provided string as a character array?


Look at this:

  • How do HTML parses work if they're not using regexp?

  • Parsing HTML documents:

Here is a good example.


Well, you could start with a basic approach along the lines of:

http://www.blackbeltcoder.com/Articles/strings/parsing-html-tags-in-c

And then just expand it to store everything into the full DOM tree structure.
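To make the "expand it into a tree" step concrete, here is a rough sketch of a single left-to-right pass that keeps a stack of open elements and attaches each new node to whatever is on top. This is not how browsers or the linked article do it; `Node` and `parse` are hypothetical names, and it assumes simple, well-formed markup (no comments, no doctype, no `>` inside attribute values, attributes are simply discarded).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element in the toy tree (hypothetical structure, not a real DOM API)."""
    tag: str
    children: list = field(default_factory=list)  # child Nodes or text strings


def parse(markup: str) -> Node:
    """Build a tree from simple, well-formed markup in one left-to-right pass."""
    root = Node("#document")
    stack = [root]  # open elements; the last one is the current parent
    i, n = 0, len(markup)
    while i < n:
        if markup[i] == "<":
            end = markup.find(">", i)
            if end == -1:
                raise ValueError("unterminated tag")
            body = markup[i + 1:end].strip()
            if body.startswith("/"):
                # Closing tag: must match the innermost open element.
                name = body[1:].strip()
                if len(stack) == 1 or stack[-1].tag != name:
                    raise ValueError(f"unexpected </{name}>")
                stack.pop()
            else:
                self_closing = body.endswith("/")
                parts = body.rstrip("/").split()
                if not parts:
                    raise ValueError("empty tag")
                node = Node(parts[0])  # attributes are ignored in this sketch
                stack[-1].children.append(node)
                if not self_closing:
                    stack.append(node)  # later nodes become its children
            i = end + 1
        else:
            # Text run: everything up to the next '<' (or end of input).
            end = markup.find("<", i)
            if end == -1:
                end = n
            text = markup[i:end]
            if text.strip():  # skip whitespace-only runs
                stack[-1].children.append(text)
            i = end
    if len(stack) != 1:
        raise ValueError(f"unclosed <{stack[-1].tag}>")
    return root
```

For example, `parse("<p>Hello <b>world</b><br/></p>")` gives a `#document` node whose only child is a `p` node containing the text `"Hello "`, a `b` node and a `br` node. The stack is the part a plain regex can't give you: it is what tracks arbitrary nesting depth. Real HTML parsers add a lot on top of this (error recovery, implied end tags, entities, script/style handling).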
