
Automating scraping of table data to XML

Problem

I have a YQL query result that I'm trying to convert and sort into a clean XML file.

Background

Annoyingly, information about the World Cup isn't freely available in an easy-to-reuse format.

So, after a bit of finessing with YQL I have managed to liberate the required table rows which contain the data I'm after.

The YQL query can be viewed at: http://query.yahooapis.com/v1/public/yql/ravingbeefsteak/worldcup2010groupliberator?diagnostics=true

The result of this is a whole bunch of table rows (view source from within your browser to see this).

I'd now like to take these table rows and convert them into an XML file, but being an absolute n00b I don't know where to start or what to look for.

The file could also use some structure to it, so part of working this out will involve creating that XML structure which I envision would look something like:

<teams>
  <team>
    <name>X</name>
    <webpage>X</webpage>
    <flagsrc>X</flagsrc>
    ...
  </team>
</teams>

I also need to do a find and replace on the data (on what would become the contents of the XML team/webpage and team/flagsrc elements) to prepend additional data to these fields without manual intervention.

If anyone can point me in the right direction of what I need to be doing to make my needs a reality it would be greatly appreciated.


Am I missing something? The document linked to is already an XML document.

If you want to transform the data to another XML format, look at XSLT. I'd give more information but you did not indicate what platform you are on.
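If XSLT isn't an option on your platform, the same transformation can be sketched in a few lines of a scripting language. Below is a minimal Python example using only the standard library. Note that the sample rows, the `rows_to_teams` helper, and the `BASE_URL` prefix are all made up for illustration, since I'm guessing at what the rows in your result look like; you would need to adjust the lookups to match the real markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the table rows; the real markup returned by the
# YQL query may differ, so adjust the lookups below to match it.
SAMPLE = """
<results>
  <tr>
    <td><a href="/teams/team=1/index.html"><img src="/img/flags/aaa.png"/>Team A</a></td>
  </tr>
  <tr>
    <td><a href="/teams/team=2/index.html"><img src="/img/flags/bbb.png"/>Team B</a></td>
  </tr>
</results>
"""

# Assumed prefix to prepend to relative links; replace with the real site root.
BASE_URL = "http://example.com"


def rows_to_teams(xml_text, base_url=BASE_URL):
    """Build a <teams> tree from <tr> rows, prepending base_url to links."""
    source = ET.fromstring(xml_text)
    teams = ET.Element("teams")
    for row in source.iter("tr"):
        link = row.find(".//a")
        img = row.find(".//img")
        if link is None or img is None:
            continue  # skip rows that lack the expected link or image
        team = ET.SubElement(teams, "team")
        ET.SubElement(team, "name").text = "".join(link.itertext()).strip()
        ET.SubElement(team, "webpage").text = base_url + link.get("href", "")
        ET.SubElement(team, "flagsrc").text = base_url + img.get("src", "")
    return teams


teams = rows_to_teams(SAMPLE)
print(ET.tostring(teams, encoding="unicode"))
```

The prepend step is just string concatenation at the point where each element is written, so no separate find and replace pass is needed. To save the result, wrap it with `ET.ElementTree(teams)` and call `.write()` with a filename.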
