
MongoDB, CarrierWave, GridFS and preventing file duplication

I am using Mongoid, CarrierWave and GridFS to store my uploads.

For example, I have an Article model that contains a file upload (a picture).

class Article
  include Mongoid::Document
  field :title, :type => String
  field :content, :type => String
  mount_uploader :asset, AssetUploader
end

But I would like to store each file only once, even when the same file is uploaded many times for different articles.

I saw that GridFS stores an MD5 checksum.

What would be the best way to prevent duplication of identical files?

EDIT: In fact, users of my website will be able to upload files. To avoid storing multiple copies of identical files, I would like to just link to them through an association table. Nothing difficult, but how do I do this with the libraries mentioned above? If you have any ideas, please share.

Thanks


De-duplication may very well be a worthy goal depending on your application, but my first instinct in approaching this problem would be to turn it around: why do you expect a lot of duplicate uploads? Can you reduce that likelihood so that users don't have to spend needless time uploading and you don't have to spend needless effort checking for duplicates?

What if you create an Asset model and attach the uploader to that, then an Article references_one :asset, and you let users choose from already available assets when creating a new article or upload a new one if needed?
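If duplicates still turn out to matter, the shared-asset idea can be combined with a checksum lookup. The following is only a rough sketch in plain Ruby, not Mongoid or CarrierWave code: `AssetStore` and `find_or_create` are hypothetical names, and the in-memory hash stands in for a query against an Asset collection with a checksum field. It uses MD5 because that is the checksum mentioned in the question; MD5 is adequate for catching accidental duplicates but not for resisting deliberately crafted collisions.

```ruby
require "digest"

# Hypothetical stand-in for an Asset document. In the real app this would be
# a Mongoid model with the uploader mounted on it and a checksum field.
Asset = Struct.new(:checksum, :data)

# Hypothetical in-memory store, used here in place of a database lookup.
class AssetStore
  def initialize
    @by_checksum = {} # checksum string => Asset
  end

  # Return the existing asset for identical content, or store a new one.
  def find_or_create(data)
    checksum = Digest::MD5.hexdigest(data)
    @by_checksum[checksum] ||= Asset.new(checksum, data)
  end

  # Number of distinct files actually stored.
  def size
    @by_checksum.size
  end
end

# An article holds a reference to a shared asset rather than its own copy.
Article = Struct.new(:title, :asset)

store  = AssetStore.new
first  = Article.new("First",  store.find_or_create("same picture bytes"))
second = Article.new("Second", store.find_or_create("same picture bytes"))
third  = Article.new("Third",  store.find_or_create("another picture"))
```

Here `first` and `second` end up pointing at the same asset, so only two files are stored for three articles. In a real application the lookup would have to run before the file is written to GridFS, and a unique index on the checksum would be needed to handle concurrent uploads.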

I may not understand your application domain if you're giving a simplified example (please explain further if so), and it's certainly possible that duplication could still be a real issue, but I'd start by asking why significant duplication is expected, and next gather some data about how much of a problem it really is in your app and dataset before expending a lot of effort to address it.
