
When to write a parser using a grammar vs. using a language's built-in regular expressions

During my master's I saw how to write parsers and compilers using ANTLR. But in the real world, we often need to parse and extract relevant content from a heavy load of incoming stream data. Each language has its own regular expression engine, which can conveniently be used to parse the data. Alternatively, we can write an EBNF grammar and use a slick tool like ANTLR to generate the parser automatically. The latter approach is less error-prone and tends to be more reliable than the former (especially when the input contains extra spaces or newlines).

I would like to know where the borderline between these two worlds lies: when would one write a whole grammar and generate a parser, and when would one just use the language's built-in regex engine and roll out a small parser that gets the job done quickly? Again, I am not looking for arguments, just trying to understand how far, and with which approach, people go when writing parsers.


If your input stream can be processed by a regular expression and it isn't complex, then use a regex. A stream of records where each record has a slot and a value can be processed quite reasonably this way.
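As a rough illustration of the flat case, here is a minimal Python sketch. The `slot = value` line format, the `RECORD` pattern, and the `parse_records` helper are all made up for this example; your real stream will look different.

```python
import re

# Hypothetical record format (an assumption): one "slot = value" pair per line,
# with optional whitespace around the slot, the "=", and the value.
RECORD = re.compile(r"^\s*(?P<slot>\w+)\s*=\s*(?P<value>.*?)\s*$")

def parse_records(lines):
    """Yield (slot, value) pairs, silently skipping lines that do not match."""
    for line in lines:
        m = RECORD.match(line)
        if m:
            yield m.group("slot"), m.group("value")

stream = ["name = Alice", "  age=30  ", "", "not a record"]
records = list(parse_records(stream))
print(records)
```

Tolerating the extra spaces is a matter of sprinkling `\s*` in the right places, which is manageable as long as the records stay flat.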

If the stream has arbitrarily nested records, doing it by regex is impractical (and with classical regular expressions, impossible, since they cannot match arbitrarily deep nesting), and you should switch to using a BNF grammar and a parser generator.
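To make the nested case concrete, here is a small sketch in Python. Note that it is a hand-written recursive-descent parser, not one generated by ANTLR or another parser generator; it only shows the kind of code a grammar maps onto. The grammar in the comment and every name in the code are hypothetical, chosen for this example.

```python
import re

# Hypothetical grammar (EBNF), invented for illustration:
#   record = "{" [ pair { "," pair } ] "}" ;
#   pair   = WORD ":" value ;
#   value  = WORD | record ;
TOKEN = re.compile(r"\s*([{}:,]|\w+)")
PUNCT = set("{}:,")

def tokenize(text):
    """Split the input into punctuation and word tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def expect(tokens, i, wanted):
    """Check that tokens[i] is the wanted punctuation; return the next index."""
    if i >= len(tokens) or tokens[i] != wanted:
        raise ValueError(f"expected {wanted!r} at token {i}")
    return i + 1

def word(tokens, i):
    """Read a WORD token; return (word, next index)."""
    if i >= len(tokens) or tokens[i] in PUNCT:
        raise ValueError(f"expected a word at token {i}")
    return tokens[i], i + 1

def parse_record(tokens, i=0):
    """Parse one record starting at tokens[i]; return (dict, next index)."""
    i = expect(tokens, i, "{")
    record = {}
    while i < len(tokens) and tokens[i] != "}":
        if record:  # every pair after the first is preceded by a comma
            i = expect(tokens, i, ",")
        key, i = word(tokens, i)
        i = expect(tokens, i, ":")
        if i < len(tokens) and tokens[i] == "{":
            value, i = parse_record(tokens, i)  # recursion handles nesting
        else:
            value, i = word(tokens, i)
        record[key] = value
    return record, expect(tokens, i, "}")

def parse(text):
    """Parse a complete input consisting of exactly one record."""
    tokens = tokenize(text)
    record, i = parse_record(tokens)
    if i != len(tokens):
        raise ValueError("trailing input after record")
    return record

nested = parse("{a: 1, b: {c: 2, d: {e: 3}}}")
print(nested)
```

The regex is still there, but only as a tokenizer; the nesting is handled by `parse_record` calling itself, which is exactly the part a single regular expression cannot express. With a parser generator you would write only the grammar and let the tool produce the equivalent of these functions.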
