To use a Blob or not to (mysql + coldfusion)

2023-03-01 08:05 Q&A author:

I would like to know if storing PDFs in a database table is a good long-term idea. Here is a description of the problem:

I have a customer that has hundreds of clients who upload numerous PDF files as proofs. These PDF files range from fairly small (< 100 KB) to 10 MB. Files can be uploaded multiple times, as they are proofs for a single project (i.e. proof1.pdf, proof2.pdf, etc.). PDFs for each customer must remain separate, and PDFs for each project must remain separate for each customer.

Currently it is set up so that the files are uploaded directly to a folder created for each client for each project. This is OK but does take up space, and finding files can be a bit of a nightmare. Like I said, multiple proofs will be uploaded for each project and each customer.

The best solution I can think of is to provide an interface that will upload PDF files directly into a db table which keeps track of the customer id, project id, and proof. This provides much better security, and provides the ability to get all PDF files from each customer for project X.

A database cleanup tool will be developed to delete records that are older than a specified period of time, so the table will not continue to grow forever, but I am worried about the performance hit (if there is one) and other negatives that I might be overlooking.

So, overall, is this a good idea, or should I figure out a better way to handle this in the filesystem?

I would recommend storing lightweight keys that point to data in a filesystem, in lieu of storing the actual files' data in a BLOB field. One possible arrangement would be to hash your files (with, say, SHA-1) and use that hash as the filename on disk, possibly even arranging the storage into a directory tree that maps over the first n hash characters (e.g., 80cdef... might be stored in storage/8/0/c/d/80cdef...).
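To make the hashed-filename arrangement concrete, here is a minimal sketch in Python (not ColdFusion, purely for illustration). The function names `sharded_path` and `store_file`, the `root` directory, and the default depth of 4 are assumptions made for this example; only the SHA-1 hashing and the one-directory-per-leading-character layout come from the description above.

```python
import hashlib
from pathlib import Path


def sharded_path(root: Path, digest: str, depth: int = 4) -> Path:
    """Map a hex digest to root/<c1>/<c2>/.../<digest>, using its first `depth` characters."""
    # Unpacking the string slice yields one directory level per character.
    return root.joinpath(*digest[:depth], digest)


def store_file(root: Path, data: bytes, depth: int = 4) -> str:
    """Write `data` under its SHA-1 name and return the hex digest to keep in the database."""
    digest = hashlib.sha1(data).hexdigest()
    target = sharded_path(root, digest, depth)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Identical content hashes to the same name, so a re-upload is stored only once.
    if not target.exists():
        target.write_bytes(data)
    return digest
```

One side effect of naming files by content hash is that byte-identical uploads collapse into a single file on disk, so any cleanup job should check that no other database row still references a hash before deleting the file.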

Your table then might consist of a primary key, a human-friendly display name for the file, and a field containing the (hash) name of the physical file on disk.
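A rough sketch of such a table follows, using Python's built-in `sqlite3` as a stand-in for MySQL so the example runs anywhere. The table name `proofs`, the column names, and the helper functions are all hypothetical; the `customer_id` and `project_id` columns are added on the assumption that the question's per-customer, per-project separation is wanted.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a metadata table; the PDF bytes themselves live on disk, not in the database."""
    conn.execute(
        """
        CREATE TABLE proofs (
            id           INTEGER PRIMARY KEY,
            customer_id  INTEGER NOT NULL,
            project_id   INTEGER NOT NULL,
            display_name TEXT    NOT NULL,  -- human-friendly name, e.g. proof1.pdf
            file_hash    TEXT    NOT NULL,  -- hash name of the physical file on disk
            uploaded_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def add_proof(conn, customer_id, project_id, display_name, file_hash):
    """Record one uploaded proof; parameters are bound, never string-concatenated."""
    conn.execute(
        "INSERT INTO proofs (customer_id, project_id, display_name, file_hash) "
        "VALUES (?, ?, ?, ?)",
        (customer_id, project_id, display_name, file_hash),
    )


def proofs_for_project(conn, customer_id, project_id):
    """Return (display_name, file_hash) pairs for one customer's project, oldest first."""
    cur = conn.execute(
        "SELECT display_name, file_hash FROM proofs "
        "WHERE customer_id = ? AND project_id = ? ORDER BY id",
        (customer_id, project_id),
    )
    return cur.fetchall()
```

Getting "all PDF files from each customer for project X" then becomes a single indexed query over small rows, and the application reads each file from disk using the returned hash.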

This also lends you the flexibility to physically separate the file storage from the database storage into, say, a distributed filesystem; this would be a rather reasonable separation to make in a long-term system that will inevitably grow very large in size. In this way, you retain the benefits of a relatively small database (potentially better performance and less backup pain) while offloading the more difficult problem of massive storage to a system that exists outside of the database itself, and for which there are already a plethora of proven approaches.

I tend to shy away from storing files in databases. I've worked with Blackboard installations on a campus, and that application lets users upload files. As a result, the database grew to an unmanageable size, over 1 TB. Blackboard's backup system packaged up each course as a zip file, and to do a complete backup of a course, all of the files had to be pulled and compressed... this became a lengthy process. We had to split (and re-split) backups regularly.

Here is another post that comments on this: Stack Overflow post
