开发者

Uses of writing Lexers and Parsers other than for Compilers?

What kind of problems other than writing compilers can be solved using Lexers and Parsers ?

What are the advantages / disadvantages of using Lexers and Parsers over jus开发者_JS百科t writing regular expression statements in a programming language.

Are there any situations where only a Lexer or only a Parser is used ?

PS: Precise Comparison Examples would be nice


Lexers and parsers are good for computerized interpretation of anything that is a context-free language but not a regular language.

In more practical terms, this means that they're good for interpreting anything that has a defined structure but is beyond the capabilities of (or more difficult to do with) regex.

For instance, it is difficult if not impossible to write a regular expression which will determine if a given document is valid HTML (due to things like tag nesting, escape characters, required attributes, et cetera). On the other hand, it's (relatively) trivial to write a parser for HTML.

Similarly, you would probably not want to even try to write a regex to determine the order of operations in a mathematical expression. On the other hand, a parser can do it easily.


As for your question regarding individual lexers or parsers:

Neither is "necessary" for the other, or at all.

For instance, one could have human-readable words which translate directly to machine opcodes that would get lexed directly into machine code (this would essentially be a very basic "assembly language"). This would not require a parser.

One could also simply write programs in a way that already was expressed in machine-readable individual symbols and thus easy for a machine to parse - for instance, boolean algebra expressions that used only the symbols 0, 1, &, |, ~, (, and ). This would not require a lexer.

Or you could do without either - for instance, Brainfuck needs neither lexing nor parsing because it is simply a set of ordered instructions; the interpreter just maps symbols to things to do. Machine opcodes, similarly, do not require either.

Mostly, lexers and parsers are written to make things nicer and easier. It's nicer not to have to write everything in individual single-meaning glyphs. It's easier to be able to write out complex expressions in whatever way is convenient (say, with parentheses, (3+4)*2) than it is to force ourselves to write them in ways that machines work (say, RPN: 3 4 + 2 *).


A famous example where parsing is more adapted than regular expressions (because the object of processing is, inherently, a non-regular context-free language) is X?(HT)?ML manipulation. See Jeff Atwood's famous blog post on the subject, derived from a famous answer on this site.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜