Difference between RegexParsers, StandardTokenParsers and JavaTokenParsers in Scala
I am learning parser combinators in Scala and have seen different ways of parsing. I mainly see three different kinds of parsers, i.e. `RegexParsers`, `StandardTokenParsers` and `JavaTokenParsers`. I am new to parsing and don't get how to choose the suitable parser for a given requirement. Can anyone please explain how these different parsers work and when to use them?
There are several different parser traits and base classes for different purposes.
The main trait is `scala.util.parsing.combinator.Parsers`. This has most of the main combinators like `opt`, `rep`, `elem`, `accept`, etc. Definitely look over the documentation for this one, since this is most of what you need to know. The actual `Parser` class is defined as an inner class here, and that's important to know about, too.
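For illustration, here is a minimal sketch of using `Parsers` directly, assuming you fix `Elem` to `Char` and invent a tiny signed-number grammar (the names `digit`, `number` and `signed` are just examples, not library parsers):

```scala
import scala.util.parsing.combinator.Parsers
import scala.util.parsing.input.CharSequenceReader

object CharParsers extends Parsers {
  type Elem = Char  // Parsers is abstract over the element type

  // elem with a predicate matches a single input element
  def digit: Parser[Char] = elem("digit", _.isDigit)

  // rep1 requires at least one repetition; ^^ maps the result
  def number: Parser[String] = rep1(digit) ^^ (_.mkString)

  // opt makes the leading minus sign optional
  def signed: Parser[String] = opt(elem('-')) ~ number ^^ {
    case Some(s) ~ n => s"$s$n"
    case None ~ n    => n
  }
}

// CharParsers.signed(new CharSequenceReader("-42"))  // Success("-42", ...)
```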
Another important trait is `scala.util.parsing.combinator.lexical.Scanners`. This is the base trait for parsers which read a stream of characters and produce a stream of tokens (also known as lexers). In order to implement this trait, you need to implement a `whitespace` parser, which reads whitespace characters, comments, etc. You also need to implement a `token` method, which reads the next token. Tokens can be whatever you want, but they must be a subclass of `Scanners.Token`. `Lexical` extends `Scanners` and `StdLexical` extends `Lexical`. The former provides some useful basic operations (like `digit`, `letter`), while the latter actually defines and lexes common tokens (like numeric literals, identifiers, strings, reserved words). You just have to define `delimiters` and `reserved`, and you will get something useful for most languages. The token definitions are in `scala.util.parsing.combinator.token.StdTokens`.
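As a rough sketch, a `StdLexical`-based lexer can be as short as this (the keyword and delimiter sets below are made-up examples for a small hypothetical language):

```scala
import scala.util.parsing.combinator.lexical.StdLexical

// StdLexical already knows how to read identifiers, numeric literals,
// string literals and comments; we only list reserved words and delimiters.
class MyLexer extends StdLexical {
  reserved   ++= Set("if", "then", "else", "let")
  delimiters ++= Set("(", ")", "+", "-", "*", "/", "=", ";")
}
```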
Once you have a lexer, you can define a parser which reads a stream of tokens (produced by the lexer) and generates an abstract syntax tree. Separating the lexer and parser is a good idea since you won't need to worry about whitespace or comments or other complications in your syntax. If you use `StdLexical`, you may consider using `scala.util.parsing.combinator.syntactical.StdTokenParsers`, which has parsers built in to translate tokens into values (e.g., `StringLit` into `String`). I'm not sure what the difference is with `StandardTokenParsers`. If you define your own token classes, you should just use `Parsers` for simplicity.
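As an example of the token-based approach, here is a sketch using `StandardTokenParsers`; the grammar and the AST classes `Num`, `Add` and `Assign` are hypothetical, just to show the overall shape:

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object MiniParser extends StandardTokenParsers {
  // Configure the built-in StdLexical instance
  lexical.reserved   += "let"
  lexical.delimiters ++= Set("=", "+")

  // Hypothetical AST for illustration
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr
  case class Assign(name: String, value: Expr) extends Expr

  // numericLit and ident come from StdTokenParsers; string literals like
  // "let", "=", "+" are implicitly turned into keyword/delimiter parsers
  def num: Parser[Expr] = numericLit ^^ (s => Num(s.toInt))
  def expr: Parser[Expr] = num ~ rep("+" ~> num) ^^ {
    case first ~ rest => rest.foldLeft(first)(Add)
  }
  def assign: Parser[Assign] = "let" ~> ident ~ ("=" ~> expr) ^^ {
    case name ~ value => Assign(name, value)
  }

  // Run the lexer over the source, then the token parser over the tokens
  def parse(src: String) = assign(new lexical.Scanner(src))
}

// MiniParser.parse("let x = 1 + 2")
// Success(Assign(x, Add(Num(1), Num(2))), ...)
```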
You specifically asked about `RegexParsers` and `JavaTokenParsers`. `RegexParsers` is a trait which extends `Parsers` with one additional combinator: `regex`, which does exactly what you would expect. Mix `RegexParsers` into your lexer if you want to use regular expressions to match tokens. `JavaTokenParsers` provides some parsers which lex tokens from Java syntax (like identifiers, integers) but without the token baggage of `Lexical` or `StdLexical`.
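For instance, a small key/value parser built on `JavaTokenParsers` might look like this; the grammar itself is made up for illustration:

```scala
import scala.util.parsing.combinator.JavaTokenParsers

object KeyValueParser extends JavaTokenParsers {
  // ident and floatingPointNumber are predefined token parsers;
  // the literal "=" is implicitly lifted to a parser by RegexParsers
  def pair: Parser[(String, Double)] =
    ident ~ ("=" ~> floatingPointNumber) ^^ { case k ~ v => (k, v.toDouble) }

  def pairs: Parser[Map[String, Double]] = rep(pair) ^^ (_.toMap)
}

// KeyValueParser.parseAll(KeyValueParser.pairs, "x = 1.5 y = 2")
// Success(Map(x -> 1.5, y -> 2.0), ...)
```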
To summarise, you probably want two parsers: one which reads characters and produces tokens, and one which takes tokens and produces an AST. Use something based on `Lexical` or `StdLexical` for the first. Use something based on `Parsers` or `StdTokenParsers` for the second, depending on whether you use `StdLexical`.
`RegexParsers` allows you to use regex values (typically in the form `"re pattern".r`, but equally any other `Regex` instance). There are no pre-defined lexical productions (tokens).
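A minimal sketch, assuming a made-up comma-separated format where the tokens are defined inline as regexes:

```scala
import scala.util.parsing.combinator.RegexParsers

object CsvLine extends RegexParsers {
  // A field is any run of characters other than comma or newline
  def field: Parser[String] = """[^,\n]*""".r
  // repsep parses fields separated by commas
  def line: Parser[List[String]] = repsep(field, ",")
}

// CsvLine.parseAll(CsvLine.line, "a,b,c")  // Success(List(a, b, c), ...)
```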
`JavaTokenParsers` defines lexical productions for Java tokens: `decimalNumber`, `floatingPointNumber`, `stringLiteral`, `wholeNumber`, `ident` (identifier).
`StandardTokenParsers` defines lexical productions "... for a simple, Scala-like language. It parses keywords and identifiers, numeric literals (integers), strings, and delimiters." Its constituents are actually defined in `StdLexical`.