
How should I (intelligently) store and archive large xml files for a data import

We've got a Rails app that processes large amounts of XML data imports. Right now we're storing these ~5MB XML docs in Postgres. This is not ideal given that we use each XML doc once or twice for parsing. We'd like to have an intelligent way of storing and archiving these docs, but not overly complicate the retrieval process for the sake of space. We've considered moving the docs to Mongo (which we're also using), but then aren't we just artificially boosting the memory requirements of our MongoDB servers?

What's the best way for us to deal with this?


If you only use each file for parsing once or twice, I would just store a link (path) to the file in the DB and load the file from that link when needed. Another approach is to use an XML database, e.g. eXist.
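If you go the link-in-the-DB route, a minimal sketch might look like the following. The `XmlImport` model, its `file_path` column, the `storage/xml_imports` directory, and the use of Nokogiri for parsing are all assumptions for illustration, not your actual schema (`ApplicationRecord` would be `ActiveRecord::Base` on older Rails):

```ruby
require "fileutils"
require "securerandom"
require "nokogiri"  # assumed parsing library; swap in whatever you already use

class XmlImport < ApplicationRecord
  # Write the raw XML to disk and keep only its location in Postgres.
  def self.create_from_xml(raw_xml)
    path = Rails.root.join("storage", "xml_imports", "#{SecureRandom.uuid}.xml")
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, raw_xml)
    create!(file_path: path.to_s)
  end

  # Load the document from disk only when it is actually parsed.
  def parsed_document
    Nokogiri::XML(File.read(file_path))
  end
end
```

The row in Postgres stays a few bytes wide, and the ~5MB payload only touches memory during the one or two parses you actually do.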


You could try eXist, an XML database. If you are just archiving them, though, why not just store them in a directory tree?
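For the directory-tree option, one common layout is to fan the files out by date so no single directory grows huge. A rough sketch, assuming a hypothetical archive root and a document id you already have:

```ruby
require "fileutils"

ARCHIVE_ROOT = "/var/archive/xml".freeze  # hypothetical location

# Build a year/month path, e.g. /var/archive/xml/2011/03/42.xml
def archive_path(document_id, received_at)
  File.join(ARCHIVE_ROOT, received_at.strftime("%Y/%m"), "#{document_id}.xml")
end

# Write the raw XML into the tree and return where it landed.
def archive!(document_id, received_at, raw_xml)
  path = archive_path(document_id, received_at)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, raw_xml)
  path
end
```

Retrieval stays trivial: rebuild the path from the id and date (or store the path in the DB) and read the file back.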


You may want to look into DB2's pureXML capabilities. To play with it, you can download the free DB2 Express-C edition. For the record, IBM is also the only database provider officially supporting its Ruby driver and Rails adapter, so you wouldn't be on your own.


What harm are they doing where they are? They will take up 'space' wherever you put them.

If you are confident you will never need them again, then there is a case for archival to less expensive storage (e.g. tape?) - otherwise whatever you do will 'overly complicate the retrieval process'.

You could consider compressing them in place if you are not already doing so.
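As a rough illustration of in-place compression with Ruby's bundled Zlib (assuming the documents sit on disk as plain .xml files; 5MB of XML usually gzips down to a small fraction of that):

```ruby
require "zlib"

# Replace foo.xml with foo.xml.gz; the original is removed only after the
# compressed copy has been written.
def compress_in_place(xml_path)
  Zlib::GzipWriter.open("#{xml_path}.gz") { |gz| gz.write(File.read(xml_path)) }
  File.delete(xml_path)
  "#{xml_path}.gz"
end

# Read the document back without writing an uncompressed copy to disk.
def read_compressed(gz_path)
  Zlib::GzipReader.open(gz_path) { |gz| gz.read }
end
```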
