开发者

Is it more memory efficient to store images in mysql as BLOB or as plain files somewhere?

I have a LAMP server with 256MB RAM (poor man's server in cloud). I have an app written to run on this machine. Currently people upload images and they go straight into mysql as BLOB.

There are concerns that this might be very memory consuming operation and we move over it to simple plain files. Can some one tell me if these concerns are valid? (Worth putting efforts into changing a lot of ode that's already written given that we will have sufficient RAM in next 6 months ?)

As a general rule when should we store images in DB and when as files开发者_如何学JAVA?


To read a BLOB in MySQL you need three times as much memory as it takes (it gets copied into several buffers).

So yes, reading a BLOB in MySQL consumes more memory than reading a file.


You should store them in the file system for several reasons:

  • The images are easily accessible via other apps (shell, FTP, www, etc...),
  • It's less resource intensive (including memory) to read them from the file system than from a database
  • If the database gets corrupted, the images are safe.
  • You also won't have tables bump up against their size limitations (determined by OS file size limitations) which slows them down (and making them require more resources to read).

The only time you should consider storing images in the DB is when they are used in transaction processing, and even then, there are numerous workarounds to that when storing in the file system.


To summarize:

Database Storage:

Pros:

  • Assures referential integrity
  • Easier backup strategy
  • Easier clustering (database cluster)

Cons:

  • Higher cost in memory usage and storage
  • Hard to scale
  • Additional code must be written to support HTTP caching
  • Requires a database and associated querying code

File System Storage

Pros:

  • Low memory footprint (more efficient)
  • Storage equal to file size
  • Easy retrieval and storage
  • Allows the web server to control caching

Cons:

  • Referential integrity not assured
  • Backups are not always in sync with database backups
  • Requires additional backup strategy

If referential integrity of your images is important, store them in the database. The advantages is that a backup of your database will always means that your rows and images are in sync. It does mean though that it is a bit more costly resource wise to store and retrieve.

If the images themselves are not that important, store them as files. It allows for fast and simple retrieval and storage. The downside of using files though is that your backup strategy becomes more complicated and your files will not always be in sync with your database rows.

I personally always store them in the database. For me, the rewards are greater then the cost. This is hardly always the case though and you should look at your application requirements to see which is best for you.

Some large websites are using BLOBs to store their website content. Flickr's use of BLOBs is actually well documented. To answer your question though, file storage is more memory efficient than database storage.


If it is for serving on a WebPage I would user plain file system with a link to the image file name in text format on DataBase. Apache and browsers usually do an amazing job on caching static files.

Even though in theory, you could achieve similar performance serving images from a database, the amount of work you need to do for it does not justify this selection given that the only advantage I can think of is a more cohesive database (with a simple DB dump you get ALL your data: images + data).


If you have a lot of files to keep track of, or they are very large, I'd store them as files. Especially if these files are to be accessed via the web, in which case you can offload all that effort from the SQL server and let the web server handle the transfer.

A good way to track images is to name them using the primary key, and then keep track of the original file name (if you need it) in the database. This way you can always know which file connects to which row. Also, if you have many files (thousands, millions,...), you might consider 'hashing' them into directories, so that 1-1000 are stored in /1, 1001-2000 ares stored in /2, etc. Some OS's see a bit of slowdown when you get a large number of files in a single directory.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜