Document Management System - Database Design

Posted 2023-02-03 17:02

I'm writing my own Document Management System (DMS) in Java (the ones available don't satisfy my needs).

The documents shall be described by the Qualified Dublin Core metadata standard. The easiest way to do this, in my opinion, is to pack the key-value pairs into an RDF model with an XML representation.
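To make the RDF/XML idea concrete, here is a minimal sketch that serializes one document's key-value pairs using only the JDK's built-in XML APIs. The class name `DublinCoreRdf`, the method `toRdfXml`, and the example URI are hypothetical names chosen for illustration; the sketch assumes every key is a valid property name in the DCMI Metadata Terms namespace and does no validation. A dedicated RDF library would likely be the better choice in a real system.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds an RDF/XML description for a single document.
class DublinCoreRdf {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DCTERMS_NS = "http://purl.org/dc/terms/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Each key may carry several values, so the map holds a list per key.
    static String toRdfXml(String documentUri, Map<String, List<String>> metadata) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder().newDocument();

            // Root element with explicit namespace declarations.
            Element root = dom.createElementNS(RDF_NS, "rdf:RDF");
            root.setAttributeNS(XMLNS_NS, "xmlns:rdf", RDF_NS);
            root.setAttributeNS(XMLNS_NS, "xmlns:dcterms", DCTERMS_NS);
            dom.appendChild(root);

            // One rdf:Description per document, identified by its URI.
            Element description = dom.createElementNS(RDF_NS, "rdf:Description");
            description.setAttributeNS(RDF_NS, "rdf:about", documentUri);
            root.appendChild(description);

            // Repeated keys simply become repeated property elements.
            for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
                for (String value : entry.getValue()) {
                    Element property = dom.createElementNS(DCTERMS_NS, "dcterms:" + entry.getKey());
                    property.setTextContent(value);
                    description.appendChild(property);
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not serialize metadata", e);
        }
    }
}
```

Calling `DublinCoreRdf.toRdfXml("urn:example:doc1", Map.of("creator", List.of("Alice", "Bob")))` would yield one `rdf:Description` element containing two `dcterms:creator` children.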

To store the metadata for all documents I have two ideas (the document files themselves will be stored in the filesystem):

1. Store all metadata of all documents in a single XML file.
2. Create an XML file for each document and store it either in the filesystem or in an RDBMS (like the H2 database engine for Java). A key-value database won't solve this, because the keys for one document are not unique.
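The non-unique keys mentioned in the second option (for example, a document with several creators) can be modelled in memory as a map from each key to a list of values. The sketch below is only an illustration of that data structure; the class name `DocumentMetadata` and its methods are hypothetical, and it says nothing about which storage backend to pick.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model: one key can hold any number of values.
class DocumentMetadata {
    // LinkedHashMap keeps keys in insertion order, which makes output predictable.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Appends a value; an existing key keeps its earlier values.
    void add(String key, String value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Returns an immutable copy of all values for the key (empty if absent).
    List<String> get(String key) {
        return List.copyOf(values.getOrDefault(key, List.of()));
    }
}
```

In a relational schema the same shape would correspond to one row per (document, key, value) triple rather than one row per key.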

Since (many) documents are linked to each other, the first approach might be better for analysing the data, but the second approach might be much faster.

Which solution would you recommend? Or are there any better solutions?


Stefan

I don't know how your analysis works, but if you need the complete graph in memory to do it, then use variant 1 (store all metadata of all documents in a single XML file), because in this scenario variant 2 brings no gain, only extra work.

Added:

If the extra work for variant 2 is not too much, then I recommend variant 2, because it can be more scalable:

you can update or add document metadata by writing only a small XML file instead of a huge one
depending on the XML parser you use, it can be faster to parse several smaller XML files than one huge one (but this strongly depends on the amount of data).
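The first point can be sketched as follows: with one file per document, an update rewrites only that document's file and leaves all others untouched. The class `MetadataStore`, the `.rdf` file extension, and the directory layout are assumptions made for this illustration, and the sketch assumes document IDs are safe to use as file names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical store: one small XML file per document in a single directory.
class MetadataStore {
    private final Path directory;

    MetadataStore(Path directory) {
        this.directory = directory;
    }

    // Creates or overwrites the metadata file of exactly one document.
    Path save(String documentId, String xml) {
        try {
            Files.createDirectories(directory);
            Path target = fileFor(documentId);
            Files.write(target, xml.getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the metadata of one document without touching the others.
    String load(String documentId) {
        try {
            return new String(Files.readAllBytes(fileFor(documentId)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: documentId contains no path separators or other unsafe characters.
    private Path fileFor(String documentId) {
        return directory.resolve(documentId + ".rdf");
    }
}
```

Whether this is actually faster than one large file depends on the data volume and the parser, as noted above; the sketch only shows that the write path stays small.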

Have you considered using MongoDB and GridFS? http://www.mongodb.org/display/DOCS/GridFS+Specification

You can store your documents directly in MongoDB as binary data and store the associated metadata for each file in any format you want. It can store documents even if they have the same name, and it generates its own unique IDs.

BTW, even though it is not strictly part of your question: have a look at a JCR (Java Content Repository) implementation like Jackrabbit. You could use it to store your documents and maybe your metadata too.

I'd look into a NoSQL document store like CouchDB to see if it could help you.

I don't like the file system solution; there's no abstraction whatsoever to help you there.

If you are always accessing all documents, neither of your approaches will be slower than the other. Still, I would recommend the second approach. When it comes to analyzing the data, you'll need to read all documents anyway, so it makes no difference whether they are in separate files or in one file.
