HTML webpages to Wiki pages translator
I am looking for an HTML-to-wiki website translator. Basically, I want to publish the coverage reports generated by Cobertura to my Google Code website. But Google Code only supports wiki pages, so if someone can point me to a translator that turns an HTML website into wiki pages (linked together), I can publish my coverage reports.
There is a pretty good translator available here. It also supports the google code wiki syntax.
See if this can help you out.
I'm not familiar with any such translators, but it wouldn't be difficult for you to hack up a quick wiki markup DOM serializer on your own as a last resort.
Just write a function to parse the HTML using a DOM parser (my favorite is lxml, the Python binding for libxml2), serialize to wiki markup via depth-first traversal, and then wrap the whole thing in a ready-made spidering framework. (Or whip up your own. That's not too difficult either.)
Something like this Python code (using StackOverflow markup as the example):
tags = {
    'b':      {'start': '**', 'end': '**'},
    'em':     {'start': '*',  'end': '*'},
    'i':      {'start': '*',  'end': '*'},
    'strong': {'start': '**', 'end': '**'},
    # etc.
}

def serialize(node):
    # Depth-first: opening markup, the node's own text, each child,
    # closing markup, then the text that trails the node.
    tag = tags.get(node.tag, {})
    return ''.join([tag.get('start', ''), node.text or ''] +
                   [serialize(child) for child in node] +
                   [tag.get('end', ''), node.tail or ''])

# domRoot is the root element returned by the DOM parser
wiki_markup = serialize(domRoot)
That took me maybe 5 minutes and I could probably implement the whole thing in under an hour.
I left out the more complicated bits for handling block markup (stuff where newlines, indentation, or line-starting characters are significant) and footnote-style link definition, but that's not very difficult... especially if you add an optional callback argument to the tag definition structure.
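For illustration, here is one way the optional callback idea might look. This is only a sketch under some assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml (both expose .tag, .text and .tail, but ElementTree needs well-formed input), the 'callback' key and the link and list_item helpers are names made up for this example, and it emits inline links rather than footnote-style definitions.

```python
import xml.etree.ElementTree as ET

def link(node, inner):
    # Hypothetical callback: inline Markdown link built from the href attribute.
    return '[%s](%s)' % (inner, node.get('href', ''))

def list_item(node, inner):
    # Hypothetical callback for block markup: the line-starting '* '
    # and the trailing newline are both significant.
    return '* ' + inner.strip() + '\n'

tags = {
    'b':      {'start': '**', 'end': '**'},
    'strong': {'start': '**', 'end': '**'},
    'em':     {'start': '*', 'end': '*'},
    'i':      {'start': '*', 'end': '*'},
    'a':      {'callback': link},       # assumed extension of the tag table
    'li':     {'callback': list_item},  # assumed extension of the tag table
}

def serialize(node):
    tag = tags.get(node.tag, {})
    # Serialize the node's own text and all children first (depth-first).
    inner = (node.text or '') + ''.join(serialize(child) for child in node)
    if 'callback' in tag:
        # The callback sees the node and its already-serialized contents.
        body = tag['callback'](node, inner)
    else:
        body = tag.get('start', '') + inner + tag.get('end', '')
    return body + (node.tail or '')

root = ET.fromstring(
    '<ul><li>See <a href="http://example.com/">the <b>docs</b></a></li>'
    '<li>Done</li></ul>'
)
wiki_markup = serialize(root)
```

Passing the already-serialized contents to the callback keeps each handler small: it only decides how to wrap the text, not how to walk the tree.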
Really, the only time-consuming part is reinventing the Makefile-style "only update what's been changed" caching.
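A rough sketch of that caching, assuming the simplest Make-like rule (rebuild a wiki page only when it is missing or older than its HTML source) and assuming the reports live as files on disk. The names is_stale and update, and the convert argument, are hypothetical and only there for the example.

```python
import os

def is_stale(source_path, target_path):
    """Return True when the target is missing or older than its source."""
    if not os.path.exists(target_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(target_path)

def update(source_path, target_path, convert):
    """Rebuild target_path from source_path only when it is stale.

    convert is any function mapping HTML text to wiki markup.
    Returns True if the target was rewritten, False if it was skipped.
    """
    if not is_stale(source_path, target_path):
        return False
    with open(source_path, encoding='utf-8') as src:
        markup = convert(src.read())
    with open(target_path, 'w', encoding='utf-8') as dst:
        dst.write(markup)
    return True
```

Comparing modification times is cheap but can be fooled by clock changes or by tools that rewrite files without changing them; comparing content hashes would be a sturdier, slightly slower alternative.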