Backup strategy for user uploaded files on Amazon S3? [closed]

Closed. This question is off-topic. It is not currently accepting answers.

Closed 9 years ago.

We're switching from storing all user-uploaded files on our own servers to using Amazon S3. It's approximately 300 GB of files.

What is the best way to keep a backup of all the files? I've seen a few different suggestions:

  • Copy the bucket to a bucket in a different S3 region
  • Versioning
  • Back up to an EBS volume with EC2

Pros/cons? Best practice?


What is the best way to keep a backup of all the files?

In theory, you don't need to. S3 has an excellent durability track record, and your data is already stored redundantly across multiple data centers.

If you're really worried about accidentally deleting the files, use IAM users with restricted keys: for each IAM user, deny the delete operations. And/or turn on versioning and remove the ability of an IAM user to perform real deletes.
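A minimal boto3 sketch of both safeguards, assuming hypothetical names (the bucket `my-uploads-bucket` and IAM user `app-uploader` are placeholders, not anything from the question):

```python
import json

import boto3

iam = boto3.client("iam")
s3 = boto3.client("s3")

BUCKET = "my-uploads-bucket"  # hypothetical bucket name

# Turn on versioning so overwrites and deletes keep the old versions around.
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Deny the delete operations for a given IAM user via an inline policy.
deny_delete_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

iam.put_user_policy(
    UserName="app-uploader",  # hypothetical IAM user
    PolicyName="deny-s3-deletes",
    PolicyDocument=json.dumps(deny_delete_policy),
)
```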

If you still want a backup, to EBS or to another S3 bucket, it's pretty trivial to implement: just run an S3 sync utility to sync between buckets or to the EBS volume. (There are a lot of them, and one is trivial to write.) Note that you pay for unused space on your EBS volume, so it's probably more expensive if you're growing. I wouldn't use EBS unless you really had a use for local access to the files.

The upside of the S3 bucket sync is that you can quickly switch your app to use the other bucket.
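For illustration, a naive bucket-to-bucket sync in boto3 (both bucket names are hypothetical); a real sync utility would also skip objects that haven't changed, e.g. by comparing ETags, and would use multipart copy for objects over 5 GB:

```python
import boto3

s3 = boto3.client("s3")

SRC = "my-uploads-bucket"         # hypothetical source bucket
DST = "my-uploads-backup-bucket"  # hypothetical backup bucket

# Page through every object in the source bucket and copy it server-side,
# so the data never travels through this machine.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SRC):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=DST,
            Key=obj["Key"],
            CopySource={"Bucket": SRC, "Key": obj["Key"]},
        )
```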

You could also use Glacier to back up your files, but it has some severe limitations (most notably, retrievals take hours rather than milliseconds).
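If you did go that route, one way is a lifecycle rule that transitions objects to Glacier after a period. This is a sketch with a hypothetical bucket name and an arbitrary 30-day threshold, not a recommendation:

```python
import boto3

s3 = boto3.client("s3")

# Transition objects to Glacier 30 days after creation. Retrieving them
# again takes hours, which is one of the limitations mentioned above.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-uploads-backup-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-to-glacier",
                "Filter": {"Prefix": ""},  # apply to every object
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```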


IMHO, backing up to another S3 bucket in a different region (and hence a different bucket) is the best way to go:

  • You already have the infrastructure to manipulate S3, so there is little to change
  • This ensures that in the event of a catastrophic failure of S3 in one region, your backup region won't be affected

The other solutions have drawbacks that this one doesn't:

  • Versioning does not protect against a catastrophic failure
  • An EBS backup requires a separate implementation to manipulate the backed-up files directly on the disk.
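To implement the cross-region bucket approach, S3's cross-region replication can do the copying for you. Below is a sketch under stated assumptions: the bucket names, regions, and the IAM role `s3-replication` are all hypothetical, and replication requires versioning to be enabled on both buckets:

```python
import boto3

SRC = "my-uploads-bucket"   # hypothetical bucket in us-east-1
DST = "my-uploads-backup"   # hypothetical bucket in eu-west-1

# Address each bucket through a client in its own region.
src_s3 = boto3.client("s3", region_name="us-east-1")
dst_s3 = boto3.client("s3", region_name="eu-west-1")

# Replication requires versioning on both the source and the destination.
src_s3.put_bucket_versioning(
    Bucket=SRC, VersioningConfiguration={"Status": "Enabled"}
)
dst_s3.put_bucket_versioning(
    Bucket=DST, VersioningConfiguration={"Status": "Enabled"}
)

# Replicate every new object into the backup bucket. The role ARN is a
# placeholder: it must grant S3 permission to read from SRC and write to DST.
# Note that replication only applies to objects written after the rule is
# enabled; existing objects still need a one-time sync.
src_s3.put_bucket_replication(
    Bucket=SRC,
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication",
        "Rules": [
            {
                "ID": "backup-everything",
                "Prefix": "",
                "Status": "Enabled",
                "Destination": {"Bucket": f"arn:aws:s3:::{DST}"},
            }
        ],
    },
)
```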


I haven't tried it myself, but Amazon has a versioning feature that could solve your backup fears - see: http://aws.amazon.com/about-aws/whats-new/2010/02/08/versioning-feature-for-amazon-s3-now-available/
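Once versioning is on, recovering an earlier state of an object is straightforward. A small boto3 sketch, with hypothetical bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-uploads-bucket"  # hypothetical bucket with versioning enabled
KEY = "avatars/user42.png"    # hypothetical object key

# List the stored versions of the object (most recent first per key).
versions = s3.list_object_versions(Bucket=BUCKET, Prefix=KEY)
history = versions.get("Versions", [])
for v in history:
    print(v["VersionId"], v["LastModified"], v["IsLatest"])

# "Restore" the oldest stored version by copying it over the current one;
# the copy becomes the new latest version, so nothing is lost.
if history:
    oldest = history[-1]
    s3.copy_object(
        Bucket=BUCKET,
        Key=KEY,
        CopySource={"Bucket": BUCKET, "Key": KEY, "VersionId": oldest["VersionId"]},
    )
```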


  1. Copy the bucket to a bucket in a different S3 location: This may not be necessary, because S3 already achieves very high durability through redundant storage. People who want good data-access performance globally might keep copies of buckets in different data centers; so unless you want to guard against an unlikely disaster on the scale of "9/11", e.g. by keeping a copy in the Tokyo data center of buckets hosted in New York, there is little to gain. And copying buckets to other buckets within the same data center gives you very little protection when disaster strikes that data center.
  2. Versioning: It helps you achieve storage efficiency by avoiding redundant full copies, and it makes restores faster. It is definitely a good choice.
  3. Back up to an EBS volume with EC2: You will probably never do this, because EBS is a much more expensive (and faster) kind of storage in AWS compared with S3, and it exists to provide block storage to running EC2 instances. EC2 is a compute service that has nothing to do with durable file storage like S3; I cannot see any point in introducing EC2 into your data backup.