Using wget to download dokuwiki pages in plain xhtml format only

I'm currently modifying the offline-dokuwiki[1] shell script to fetch the latest documentation for an application, so it can be embedded automatically within instances of that application. This works quite well, except that in its current form it grabs three versions of each page:

  1. The full page including header and footer
  2. Just the content without header and footer
  3. The raw wiki syntax

I'm only actually interested in 2. This version is linked to from the main pages by an HTML <link> tag in the <head>, like so:

<link rel="alternate" type="text/html" title="Plain HTML" 
href="/dokuwiki/doku.php?do=export_xhtml&amp;id=documentation:index" /> 

and it has the same URL as the main wiki page, except that the query string contains 'do=export_xhtml'. Is there a way of instructing wget to download only these versions, or to automatically append '&do=export_xhtml' to every link it follows? If so, that would be a great help.

[1] http://www.dokuwiki.org/tips:offline-dokuwiki.sh (author: samlt)


DokuWiki accepts the do parameter as an HTTP header as well. You could run wget with the option --header "X-DokuWiki-Do: export_xhtml".
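
A minimal sketch of what that invocation might look like (the host and start page are placeholders based on the example path in the question; the other flags mirror a typical recursive wget mirror run, not necessarily the exact ones used in offline-dokuwiki.sh):

# Send the X-DokuWiki-Do header with every request so each page wget
# fetches is returned as the plain XHTML export rather than the full page.
wget --recursive --level=5 --convert-links --page-requisites \
     --header "X-DokuWiki-Do: export_xhtml" \
     "http://example.com/dokuwiki/doku.php?id=documentation:index"

Because the header is sent with every request, the plain-XHTML variant is saved for each page wget follows, without having to rewrite any of the followed links.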
