How do I parse XML into "messages" and print them out in Scala using stream parsing?
Now that I know how to parse XML in Scala as a stream, I need help understanding a non-trivial example.
I'd like to parse the following XML as a stream and send a message (print to the console for this example) whenever I've parsed out a full message.
I understand that stream-based parsing in Scala uses case classes to handle the different elements, but I'm just getting started and I don't quite understand how to do this.
I have this working in Java using a StAX parser, and I'm trying to translate that into Scala.
Any help would be greatly appreciated.
<?xml version="1.0" ?>
<messages>
  <message>
    <to>john.doe@gmail.com</to>
    <from>jane.doe@gmail.com</from>
    <subject>Hi Nice</subject>
    <body>Hello this is a truly nice message!</body>
  </message>
  <message>
    <to>joe@gmail.com</to>
    <from>jane.doe@gmail.com</from>
    <subject>Hi Nice</subject>
    <body>Hello this is a truly nice message!</body>
  </message>
</messages>
This is for Scala 2.8.
The typical way to process events is to use a match statement. In my case, I always needed to keep track of the parent elements while processing (for instance, to know which tag a piece of text is located in):
import scala.xml.pull._
import scala.io.Source
import scala.collection.mutable.Stack

// `xml` is assumed to be a String holding the XML document from the question
val src = Source.fromString(xml)
val er = new XMLEventReader(src)
val stack = Stack[XMLEvent]() // currently open elements, innermost on top

// println indented by the current nesting depth
def iprintln(s: String) = println((" " * stack.size) + s.trim)

while (er.hasNext) {
  er.next match {
    case x @ EvElemStart(_, label, _, _) =>
      stack push x
      iprintln("got <" + label + " ...>")
    case EvElemEnd(_, label) =>
      iprintln("got </" + label + ">")
      stack.pop()
    case EvText(text) =>
      iprintln(text)
    case EvEntityRef(entity) =>
      iprintln(entity)
    case _ => // ignore everything else
  }
}
Because entity references arrive as separate events, you will probably need to convert them to text and combine them with the surrounding text.
In the example above I only used label, but you can also use EvElemStart(pre, label, attrs, scope) to extract more information, and you can add an if guard to match more complex conditions.
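As a rough sketch of the point about attrs and guards, the loop below counts message elements whose priority attribute equals "high". The priority attribute, the sample document, and the names AttrGuardSketch and highPriorityCount are made up for illustration and do not appear in the XML from the question. The code assumes scala.xml.pull is available, as it is in 2.8.

```scala
import scala.io.Source
import scala.xml.pull._

object AttrGuardSketch {
  // Hypothetical input: the `priority` attribute is NOT in the question's XML.
  val sample =
    "<messages>" +
      "<message priority=\"high\"><to>a@example.com</to></message>" +
      "<message priority=\"low\"><to>b@example.com</to></message>" +
    "</messages>"

  // Counts <message> start tags whose priority attribute is "high".
  def highPriorityCount(xml: String): Int = {
    val er = new XMLEventReader(Source.fromString(xml))
    var count = 0
    while (er.hasNext) {
      er.next match {
        // attrs is a scala.xml.MetaData; get(key) is None when the attribute is absent
        case EvElemStart(_, "message", attrs, _)
            if attrs.get("priority").exists(_.text == "high") =>
          count += 1
        case _ => // ignore everything else
      }
    }
    count
  }
}
```

The guard keeps the filtering inside the pattern, so the case body only runs for start tags that actually match.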
Also, if you're using 2.7.x, I don't know whether the fix for http://lampsvn.epfl.ch/trac/scala/ticket/2583 was back-ported, so you may have issues processing text with entities.
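For the entity conversion mentioned above, something along these lines may be enough. This is only a sketch: unquote is a hypothetical helper (it is not part of scala.xml), it assumes the event carries the bare entity name such as amp, and it covers just the five entities predefined by XML, passing any other name through as &name;.

```scala
// The five entities predefined by the XML specification, keyed by name.
val predefinedEntities = Map(
  "lt" -> "<",
  "gt" -> ">",
  "amp" -> "&",
  "quot" -> "\"",
  "apos" -> "'"
)

// Hypothetical helper: turns an entity name into its text.
// Unknown names are re-emitted unchanged in their &name; form.
def unquote(entity: String): String =
  predefinedEntities.getOrElse(entity, "&" + entity + ";")
```

Appending unquote(entity) to the same buffer that collects EvText events puts the character back between the text fragments that surround it.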
More to the point, here is a version that handles only from and to for brevity (though I would not call it the Scala way):
class Message() {
  var to: String = _
  var from: String = _
  override def toString(): String =
    "from %s to %s".format(from, to)
}

var message: Message = _
var sb: StringBuilder = _ // non-null only while inside <to> or <from>

// er must be a fresh XMLEventReader here; a reader can only be consumed once
while (er.hasNext) {
  er.next match {
    case EvElemStart(_, "message", _, _) =>
      message = new Message
    case EvElemStart(_, label, _, _) if
        List("to", "from") contains label =>
      sb = new StringBuilder
    case EvElemEnd(_, "to") =>
      message.to = sb.toString
      sb = null
    case EvElemEnd(_, "from") =>
      message.from = sb.toString
      sb = null
    case EvElemEnd(_, "message") =>
      println(message)
    case EvText(text) if sb != null =>
      sb ++= text
    case EvEntityRef(entity) if sb != null =>
      sb ++= unquote(entity) // todo: unquote should map an entity name (e.g. amp) to its text
    case _ => // ignore everything else
  }
}
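A version somewhat closer to idiomatic Scala might use an immutable case class and hand each finished message to a callback. The following is a sketch under assumptions, not a definitive implementation: the names MessageStreamSketch, parse and onMessage are invented here, missing fields default to an empty string, entity references are left unresolved, and scala.xml.pull is assumed to be available as in 2.8.

```scala
import scala.io.Source
import scala.xml.pull._

object MessageStreamSketch {
  case class Message(to: String, from: String, subject: String, body: String)

  private val fieldNames = Set("to", "from", "subject", "body")

  // Calls onMessage once for every </message>, as soon as it is read.
  def parse(xml: String)(onMessage: Message => Unit): Unit = {
    val er = new XMLEventReader(Source.fromString(xml))
    var fields = Map.empty[String, String] // fields of the message in progress
    var current: Option[String] = None     // field element being read, if any
    val sb = new StringBuilder             // text of the current field
    while (er.hasNext) {
      er.next match {
        case EvElemStart(_, "message", _, _) =>
          fields = Map.empty
        case EvElemStart(_, label, _, _) if fieldNames(label) =>
          current = Some(label)
          sb.clear()
        case EvElemEnd(_, label) if current == Some(label) =>
          fields += (label -> sb.toString)
          current = None
        case EvElemEnd(_, "message") =>
          onMessage(Message(
            fields.getOrElse("to", ""),
            fields.getOrElse("from", ""),
            fields.getOrElse("subject", ""),
            fields.getOrElse("body", "")))
        case EvText(text) if current.isDefined =>
          sb ++= text
        case EvEntityRef(entity) if current.isDefined =>
          sb ++= "&" + entity + ";" // left unresolved in this sketch
        case _ => // ignore everything else
      }
    }
  }
}
```

Calling MessageStreamSketch.parse(xml)(println) prints each message as soon as its closing tag is read, so the whole document never has to be held in memory.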