Scala Parser - Message Length

I'm toying with Scala's Parser library. I am trying to write a parser for a format where a length is specified, followed by a message of that length. For example:

x.parseAll(x.message, "5helloworld") // result: "hello", remaining: "world"

I'm not sure how to do this using combinators. My mind first goes to:

def message = length ~ body

But obviously body depends on length, and I don't know how to do that :p

Alternatively, message could be defined as a single hand-written Parser rather than a combination of Parsers. I think that is doable, although I haven't checked whether a single Parser can consume several elems.

Anyway, I'm a Scala noob, I just find this awesome :)


You should use into for that, or its abbreviation, >>:

scala> object T extends RegexParsers {
     |   def length: Parser[String] = """\d+""".r
     |   def message: Parser[String] = length >> { length => """\w{%d}""".format(length.toInt).r }
     | }
defined module T

scala> T.parseAll(T.message, "5helloworld")
res0: T.ParseResult[String] =
[1.7] failure: string matching regex `\z' expected but `w' found

5helloworld
      ^

scala> T.parse(T.message, "5helloworld")
res1: T.ParseResult[String] = [1.7] parsed: hello

Be careful with precedence when using it. If you add an "~ remainder" after the function above, for instance, Scala will interpret it as length >> ({ length => ...} ~ remainder) instead of (length >> { length => ...}) ~ remainder.
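
To make the precedence point concrete, here is a minimal sketch with the parentheses written out. It assumes the scala-parser-combinators module is on the classpath; the names PrecedenceDemo, body, remainder and messageAndRest are made up for illustration and are not part of the library.

```scala
import scala.util.parsing.combinator.RegexParsers

object PrecedenceDemo extends RegexParsers {
  // Decimal length prefix, converted to Int.
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Exactly n word characters (hypothetical helper).
  def body(n: Int): Parser[String] = """\w{%d}""".format(n).r

  // Whatever is left on the line (hypothetical helper).
  def remainder: Parser[String] = """.*""".r

  // ~ binds tighter than >>, so the parentheses around (length >> ...) are needed.
  def messageAndRest: Parser[String ~ String] =
    (length >> (n => body(n))) ~ remainder
}
```

With this definition, PrecedenceDemo.parseAll(PrecedenceDemo.messageAndRest, "5helloworld") should succeed with "hello" on the left of the ~ and "world" on the right.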


This does not sound like a context-free language, so you will need to use flatMap:

def message = length.flatMap(n => bodyWithLength(n))

where length is of type Parser[Int] and bodyWithLength(n) would be based on repN, such as

def bodyWithLength(n: Int): Parser[String] =
  repN(n, elem("any", _ => true)) ^^ { _.mkString }
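
Put together, the flatMap approach might look like the sketch below. It assumes the scala-parser-combinators module is available; the object name LengthPrefixed is hypothetical, and turning off whitespace skipping is my own choice so that spaces inside the body count towards the length.

```scala
import scala.util.parsing.combinator.RegexParsers

object LengthPrefixed extends RegexParsers {
  // Assumption: whitespace is significant in this format.
  override def skipWhitespace: Boolean = false

  // Decimal length prefix, converted to Int.
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Exactly n characters of any kind; fails if the input ends early.
  def bodyWithLength(n: Int): Parser[String] =
    repN(n, elem("any", _ => true)) ^^ (_.mkString)

  // The body parser is chosen from the length that was just parsed.
  def message: Parser[String] = length.flatMap(n => bodyWithLength(n))
}
```

Unlike the regex-based body, this version accepts any character, not just \w. Note that the greedy \d+ will swallow leading digits of the body, so a format whose bodies can start with a digit would need a delimiter or a fixed-width length field.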


I wouldn't use parser combinators for this purpose. But if you have to, or if the problem becomes more complex, you could try this:

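def times(x: Long, what: String): Parser[Any] = x match {
  case n if n <= 0 => success(())        // nothing to match; avoids endless recursion
  case 1L => what                        // the last repetition
  case n => what ~ times(n - 1, what)    // one occurrence, then the rest
}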

Don't use parseAll if you want some input left over; use parse. You could parse the length, store the result in a mutable field x (ugly, I know, but useful here), and parse the body x times. You then get the parsed String, and the rest remains in the parser's input.
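
As a rough sketch of reading back what parse leaves unconsumed, the Success result carries the remaining input as a Reader, and its source and offset give the rest of the string. This assumes the scala-parser-combinators module; RestDemo, body and split are hypothetical names, and it uses >> instead of a mutable field.

```scala
import scala.util.parsing.combinator.RegexParsers

object RestDemo extends RegexParsers {
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Exactly n word characters (hypothetical helper).
  def body(n: Int): Parser[String] = """\w{%d}""".format(n).r

  def message: Parser[String] = length >> (n => body(n))

  // Returns (parsed message, unconsumed input), or None if parsing fails.
  def split(input: String): Option[(String, String)] =
    parse(message, input) match {
      case Success(msg, next) =>
        val src = next.source
        Some((msg, src.subSequence(next.offset, src.length()).toString))
      case _ => None
    }
}
```

RestDemo.split("5helloworld") should give Some(("hello", "world")), which matches the result and remainder asked for in the question.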
