
Wikitext to XML

Is there a way to convert wikitext data into simple XML in a Java application?

Input example:

  == A section ==
  this is some text...

  {{MyTemplate
  |attr1=some value
  |attr2=some other value
  ...

Output example:

<section title='A section'>this is some text...</section>
<ValueDescription attr1='some value' attr2='some other value' ...>

It seems like a trivial task but I couldn't find a library to do it in Java.

Mulone


XML has a tree structure; wikitext, for the most part, does not. For example, this is perfectly legal:

== A section {{DoubleEqual{{echo|Sign}}}}

The template syntax itself is hierarchical, and MediaWiki itself transforms it to XML (you can use Special:ExpandTemplates to check it out), but the rest of the syntax is much too loose for XML or other formal descriptions like a context-free grammar.
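To make the point concrete, here is a minimal sketch of a converter for the narrow subset shown in the question. It is not a real wikitext parser. It assumes level-2 headings on their own line, templates that open with `{{Name` on one line, list one `|key=value` per line and close with `}}` on its own line, and template names and keys that are already valid XML names. The class `WikitextSketch` and its methods are hypothetical names made up for this illustration, and the template name is used directly as the element name (the question maps `MyTemplate` to `ValueDescription`, which would need an extra lookup).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any wikitext library.
final class WikitextSketch {
    // Matches a heading line such as "== A section ==" (the line is trimmed first).
    private static final Pattern HEADING = Pattern.compile("==\\s*(.+?)\\s*==");

    // Escapes characters that are unsafe in XML text and in single-quoted attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("'", "&apos;");
    }

    // Emits the pending section, if any, and clears the collected text.
    private static void flushSection(StringBuilder out, String title, StringBuilder text) {
        if (title != null) {
            out.append("<section title='").append(escape(title)).append("'>")
               .append(escape(text.toString())).append("</section>\n");
        }
        text.setLength(0); // text outside any section is dropped in this sketch
    }

    static String toXml(String wikitext) {
        StringBuilder out = new StringBuilder();
        StringBuilder text = new StringBuilder();
        String title = null;            // title of the section being collected
        StringBuilder template = null;  // element being built for an open template

        for (String raw : wikitext.split("\\R")) {
            String line = raw.trim();
            if (template != null) {
                if (line.equals("}}")) {
                    // End of the template: emit it as an empty element.
                    out.append(template).append("/>\n");
                    template = null;
                } else if (line.startsWith("|") && line.indexOf('=') > 0) {
                    // "|key=value" becomes an attribute; nested templates are NOT handled.
                    int eq = line.indexOf('=');
                    String key = line.substring(1, eq).trim();
                    String value = line.substring(eq + 1).trim();
                    template.append(' ').append(key).append("='")
                            .append(escape(value)).append('\'');
                }
                continue;
            }
            Matcher heading = HEADING.matcher(line);
            if (line.startsWith("{{")) {
                flushSection(out, title, text);
                title = null;
                template = new StringBuilder("<").append(line.substring(2).trim());
            } else if (heading.matches()) {
                flushSection(out, title, text);
                title = heading.group(1);
            } else if (!line.isEmpty()) {
                if (text.length() > 0) text.append(' ');
                text.append(line);
            }
        }
        flushSection(out, title, text);
        if (template != null) out.append(template).append("/>\n"); // unterminated template
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "== A section ==\nthis is some text...\n\n"
                + "{{MyTemplate\n|attr1=some value\n|attr2=some other value\n}}\n";
        System.out.print(toXml(input));
    }
}
```

Note that the heading line from my example above is not recognized as a heading by this sketch at all, because the closing `==` only appears after template expansion. That is exactly the kind of input a line-based approach cannot deal with.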

There is a rewrite effort going on to turn wikitext into a standard, parseable language, but don't expect it to end anytime soon.


Sweble (http://sweble.org/wiki/Wikitext-parser/) has a properly done parser, but I think there is no XML output for the AST yet.

@Tgr: Syntactically, wikitext is not really compatible with a tree, but semantically it is.

And yes, handling Wikitext is a huge mess.
