
Parsing an XML file and storing it in a database

Is there a generic/automatic way in R or in Python to parse XML files with their nodes and attributes, automatically generate MySQL tables for storing that information, and then populate those tables?


Regarding

Is there a generic/automatic way in R to parse XML files with their nodes and attributes, automatically generate MySQL tables for storing that information, and then populate those tables?

the answer is a good old "yes, you can", at least in R.

The XML package for R can read XML documents and return R data.frame types in a single call using the xmlToDataFrame() function.

And the RMySQL package can transfer data.frame objects to the database in a single command (including table creation if need be) using the dbWriteTable() function, which is defined in the common DBI interface for R and provided for MySQL by RMySQL.

So in short: two lines can do it, so you can easily write yourself a new helper function that does it along with a commensurate amount of error checking.
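For the Python half of the question, the same read-then-write idea can be sketched with the standard library alone. This is not the R approach above, just a rough analogue under loud assumptions: the XML is flat (one record per child of the root, one field per grandchild), every column is stored as TEXT, and an in-memory SQLite database stands in for MySQL so the snippet runs without a server. The names xml_to_rows, write_table and SAMPLE_XML are made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: flat records, one field per child element.
SAMPLE_XML = """<people>
  <person><name>Ada</name><age>36</age></person>
  <person><name>Linus</name><age>54</age></person>
</people>"""


def xml_to_rows(xml_text):
    """Turn each child of the root into a dict of {child tag: text}."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in record} for record in root]


def write_table(conn, table, rows):
    """Create a table from the union of all keys, then insert every row.

    Table and column names are interpolated into the SQL text, so this is
    only reasonable for XML you trust.
    """
    columns = sorted({key for row in rows for key in row})
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(row.get(c) for c in columns) for row in rows],
    )
    conn.commit()
    return columns


# SQLite in memory is a stand-in for MySQL here (assumption).
conn = sqlite3.connect(":memory:")
rows = xml_to_rows(SAMPLE_XML)
columns = write_table(conn, "people", rows)
stored = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Swapping in a MySQL driver would mostly mean changing the connection and the placeholder style; the flat-XML assumption is the part that tends to break first on real files.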


They're three separate operations: parsing, table creation, and data population. You can do all three with Python, but there's nothing "automatic" about it. I don't think it's so easy.

For example, XML is hierarchical, while SQL is relational and set-based. I don't think a good relational schema can always be derived for every XML stream you might encounter.
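To make the hierarchical-versus-relational point concrete, here is a small sketch of what even a simple nesting costs you: one level of child elements already needs a second table and a foreign key, and somebody has to decide that by hand. The orders/items layout, the attribute names and load_orders are all invented for this example, and SQLite in memory again stands in for MySQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: each order nests a variable number of items.
ORDERS_XML = """<orders>
  <order id="1" customer="Ada">
    <item sku="A1" qty="2"/>
    <item sku="B7" qty="1"/>
  </order>
  <order id="2" customer="Linus">
    <item sku="A1" qty="5"/>
  </order>
</orders>"""


def load_orders(conn, xml_text):
    """Split the nesting into a parent table and a child table."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE items ("
        "order_id INTEGER REFERENCES orders(id), sku TEXT, qty INTEGER)"
    )
    for order in ET.fromstring(xml_text).findall("order"):
        order_id = int(order.get("id"))
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)", (order_id, order.get("customer"))
        )
        # The parent's id is copied onto every child row as the foreign key.
        conn.executemany(
            "INSERT INTO items VALUES (?, ?, ?)",
            [
                (order_id, item.get("sku"), int(item.get("qty")))
                for item in order.findall("item")
            ],
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_orders(conn, ORDERS_XML)
totals = conn.execute(
    "SELECT o.customer, SUM(i.qty) FROM orders o "
    "JOIN items i ON i.order_id = o.id GROUP BY o.id ORDER BY o.id"
).fetchall()
```

The schema here was chosen by a human who knew that items belong to orders. Deeper nesting, mixed content or optional branches each need another decision like that, which is why a fully generic loader is hard.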


There's the XML package for reading XML into R, and the RMySQL package for writing data from R into MySQL.

Between the two there's a lot of work. XML surpasses the scope of an RDBMS like MySQL, so something that could handle any XML thrown at it would be either ridiculously complex or trivially useless.


We do something like this at work sometimes, but not in Python. In that case, each usage requires a custom program to be written. We only have a SAX parser available. Using an XML decoder to get a dictionary/hash in a single step would help a lot.
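In Python, a "decode to a dictionary in one step" helper might look roughly like the sketch below, built on the standard library's xml.etree.ElementTree. The function name element_to_dict and the "#text" key are conventions picked for this example, not anything from a particular library, and the conversion is lossy: namespaces, tail text and a clash between an attribute name and a child tag are not handled properly.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an element into nested dicts, lists and strings."""
    node = dict(elem.attrib)  # attributes become plain keys
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf without attributes: just its text
    if text:
        node["#text"] = text  # text alongside attributes or children
    return node


doc = element_to_dict(
    ET.fromstring(
        '<book id="7"><title>R</title><author>A</author><author>B</author></book>'
    )
)
```

With a dictionary like this in hand, the per-project work shrinks to deciding which keys become which tables and columns.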

At the very least you'd have to tell it which tags map to which tables and fields; no pre-existing library can know that.
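One way to picture that "tell it which tags map where" step is a small mapping table that a generic loader consults. The sketch below is only that, a sketch: MAPPING, DATA_XML and load_with_mapping are hypothetical, all columns are TEXT, and SQLite in memory stands in for MySQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: element tag -> target table and {child tag: column}.
MAPPING = {
    "person": {"table": "people", "fields": {"name": "full_name", "age": "age_years"}},
    "city": {"table": "cities", "fields": {"name": "city_name"}},
}

DATA_XML = """<data>
  <person><name>Ada</name><age>36</age><note>ignored</note></person>
  <city><name>Oslo</name></city>
  <person><name>Linus</name></person>
</data>"""


def load_with_mapping(conn, xml_text, mapping):
    """Create one table per mapped tag and insert a row per matching element."""
    for spec in mapping.values():
        table = spec["table"]
        cols = ", ".join(f'"{c}" TEXT' for c in spec["fields"].values())
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    for elem in ET.fromstring(xml_text).iter():
        spec = mapping.get(elem.tag)
        if spec is None:
            continue  # unmapped tags are skipped, not guessed at
        table = spec["table"]
        # A missing child yields None, which is stored as NULL.
        values = [elem.findtext(tag) for tag in spec["fields"]]
        marks = ", ".join("?" for _ in values)
        conn.execute(f'INSERT INTO "{table}" VALUES ({marks})', values)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_with_mapping(conn, DATA_XML, MAPPING)
people = conn.execute(
    "SELECT full_name, age_years FROM people ORDER BY full_name"
).fetchall()
cities = conn.execute("SELECT city_name FROM cities").fetchall()
```

The loader stays generic, but the mapping still has to be written per document type, which is the point being made above.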
