To store images in SQL or not?
Generally, I had thought it was always better to store images in the filesystem and link to it via the database entry. However, I am trying to optimize my db design and am having a few questions.
My images are all really small thumbmails in black and white (not greyscale, but true B&W) and are 70x70 in size. If we take the images (which is basically a 2D array of 1 and 0), it can be stored as binary data that would be approximately 600 bytes each.
So my question is whether querying the 600 bytes stored in a db would be faster than 开发者_运维问答querying a link followed by accessing the filesystem; assuming there are a lot of "image" queries being made.
Does anyone have any experience with this area?
If it matters, I am using MySQL, and MonetDB (separately, but have the same question for both).
Many thanks, Brett
If it's only 600 bytes then I wouldn't worry too much and would store them in the database as a blob
There's an interesting article at High Scalablilty about how Flickr is architected. This might prove to be a useful read for you.
Since you tagged the question sql-server then I recommend you to read To Blob or Not To Blob, a research paper by the regretted Jim Gray. The paper goes into plenty of detail on the topic of storing BLOBs in the file system (NTFS) vs. database (SQL Server), and you'll be amazed how many angles are considered. It's a MUST read. But the conclusion is this:
The study indicates that if objects are larger than one megabyte on average, NTFS has a clear advantage over SQL Server. If the objects are under 256 kilobytes, the database has a clear advantage. Inside this range, it depends on how write intensive the workload is, and the storage age of a typical replica in the system.
Your case falls clearly in the 'To BLOB' case.
As far as I understand, there's no problem to store even bigger files in the DB, if you're not using SELECT *
for no reason (frankly speaking, there's never a reason to use SELECT *
at all).
BLOBs and TEXTs are stored separately from other data and don't affect performance if not queried explicitly.
If you're talking of web application, storing images in the database just stupid. As you have no benefits that desktop application may gain, but only difficulties.
This is not matter of file size only but also about the maximum amount of records you are expecting to have when the database is working. Older times we used to make this kind of math for any type of field. Just multiply 600 bytes for the maximum amount of records and if the result is something manageable don't worry about the speed.
As @codeholic says if you're not using SELECT *, everything goes fine.
Storing the image in the database (and serving it on each request) prevents you from caching those images in a proxy server (or rather - complicates the task many-fold and bars almost all out-of-the-box solutions). The catch is that to measure the impact you need to look at it differently - instead of "how much time does a single query to fetch an image takes" ask yourself "how much time does a series of (put a reasonable number here) queries to get the same set of images takes". Maybe also ask yourself "must I pay the cost to roundtrip to the DB and back?".
Not that the idea is without merit - updating the images in a single location might be an important factor. Also, if high availability is a factor, it's much easier to configure with the DB as the central point of data (putting the images on the file system means synchronizing them between nodes when they're updated). Change tracking, permissions, additional data, avoiding "broken links" - these might play a part as well.
All of the above aside, I've had bad experience using the "filesystem" technique. Currently we're considering a move to the "database" technique.
精彩评论