Scala Parsers: Availability, Differences and Combining?
My question is about the Scala Parsers:
- Which ones are available (in the Standard library and outside),
- what's the difference between them,
- do they share a common API and
- can different Parsers be combined to parse one input string?
I found at least these:
开发者_开发问答- Scala's "standard" parser (seems to be an LL parser)
- Scala's Packrat parser (since 2.8, is a LALR parser)
- The Parboiled parser (PEG parser?)
- Spiewak's GLL parser combinator
There's also Dan Spiewak's implementation of GLL parser combinators.
It's worth noting that Scala's standard parser combinators are not LL, nor are Packrat combinators LALR. Parser combinators are a form of recursive descent with infinite backtracking. You can think of them a bit like "LL(*)". The class of languages supported by this technique is precisely the class of unambiguous context-free languages, or the same class as LALR(1) and Packrat. However, the class of grammar is quite a bit different, with the most notable weakness being non-support for left-recursion.
Packrat combinators do support left-recursion, but they still fail to support many other, more subtle features of LALR. This weakness generally stems from the ordered choice operator, which can lead to some devilishly tricky grammar bugs, as well as prevents certain nice grammatical formulations. The most often-seen example of these bugs happens when you accidentally order ambiguous choices as shortest first, resulting in a greedy match that prevents the correct branch from ever being tried. LALR doesn't have this problem, since it simply tries all possible branches at once, deferring the decision point until the end of the production.
There is also a new approach known as "parsing with derivatives". The approach is described here. There is an implementation in Scala by Daniel Spiewak.
Just wanted to update this answer with a pointer to the latest iteration of the parboiled project, called parboiled2:
https://github.com/sirthias/parboiled2
parboiled2 targets only Scala (as opposed to Scala + Java), makes use of Scala macros, and is very actively maintained.
精彩评论