
How to make images hosted on Amazon S3 less public but not completely private?

I fired up a sample application that uses Amazon S3 for image hosting. I managed to coax it into working. The application is hosted at github.com. The application lets you create users with a profile photo. When you upload the photo, the web application stores it on Amazon S3 instead of your local file system. (Very important if you host at heroku.com.)

However, when I did a "view source" in the browser of the page, I noticed that the URL of the picture was an Amazon S3 URL in the S3 bucket that I assigned to the app. I cut & pasted the URL and was able to view the picture in the same browser, and in another browser in which I had no open sessions to my web app or to Amazon S3.

Is there any way that I could restrict access to that URL (and image) so that it is accessible only to browsers that are logged into my applications?

Most of the information I found about Amazon ACLs only talks about access for the owner, for groups of users authenticated with Amazon or Amazon S3, or for everybody anonymously.

EDIT----UPDATE July 7, 2010

Amazon has just announced more ways to restrict access to S3 objects and buckets. Among other ways, you can now restrict access to an S3 object by qualifying the HTTP referrer. This looks interesting...I can't wait until they update their developer documents.


For files where privacy actually matters, we handle this as follows:

  • Files are stored with a private ACL, meaning that only an authorized agent can download (or upload) them
  • To access a file, we link to http://myapp.com/download/{s3-path}, where download corresponds to a controller (in the MVC sense)
  • ACLs are implemented as appropriate so that only logged-in users can access that controller/action
  • That controller downloads the file using the API, then streams it out to the user with correct mime-type, cache headers, file size, etc.
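The steps above can be sketched roughly like this. It is only a minimal Rack-style illustration, not our actual controller: the `DownloadProxy` class, the `fetcher` object (a stand-in for whatever S3 client call you use), and the `:user_id` session key are all hypothetical names. For brevity it buffers the whole file in memory rather than truly streaming it.

```ruby
# Hypothetical Rack-style endpoint serving /download/{s3-path}
# to logged-in users only.
class DownloadProxy
  # fetcher: any object responding to call(path) that returns
  # [bytes, content_type], or nil when the object does not exist.
  # In a real app it would wrap your S3 client; here it is injected.
  def initialize(fetcher)
    @fetcher = fetcher
  end

  def call(env)
    # Assumption: the app stores the logged-in user's id in the session.
    session = env['rack.session'] || {}
    return respond(403, 'Forbidden') unless session[:user_id]

    path = env['PATH_INFO'].to_s.sub(%r{\A/download/}, '')
    body, content_type = @fetcher.call(path)
    return respond(404, 'Not found') unless body

    headers = {
      'Content-Type'   => content_type,
      'Content-Length' => body.bytesize.to_s,
      'Cache-Control'  => 'private, max-age=300'
    }
    [200, headers, [body]]
  end

  private

  def respond(status, message)
    headers = {
      'Content-Type'   => 'text/plain',
      'Content-Length' => message.bytesize.to_s
    }
    [status, headers, [message]]
  end
end
```

Injecting the fetcher keeps the access check separate from the storage call, so the same check works whichever S3 library you end up using.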

Using this method, you end up using a lot more bandwidth than you need, but you still save on storage. For us this works out, because we tend to run out of storage much more quickly than bandwidth.

For files where privacy only sort of matters, we generate a random hash that we use for the URL. This is basically security through obscurity, and you have to be careful that your hash is sufficiently difficult to guess.
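As a rough sketch of what "sufficiently difficult to guess" can look like, the key below takes its random part from Ruby's `SecureRandom` rather than from a hash of something predictable such as a user id or timestamp. The `random_object_key` helper and the `uploads/` prefix are made up for illustration.

```ruby
require 'securerandom'

# Builds an object key whose unguessable part comes from a
# cryptographically secure random source.
# 32 random bytes = 256 bits, rendered as 64 hex characters.
def random_object_key(extension, bytes: 32)
  "uploads/#{SecureRandom.hex(bytes)}.#{extension}"
end

puts random_object_key('png')
```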

However, when I did a "view source" in the browser of the page, I noticed that the URL of the picture was an Amazon S3 URL in the S3 bucket that I assigned to the app. I cut & pasted the URL and was able to view the picture in the same browser, and in another browser in which I had no open sessions to my web app or to Amazon S3.

Keep in mind that this is no different than any image stored elsewhere in your document root. You may or may not need the kind of security you're looking for.


Amazon's Ruby SDK (https://github.com/aws/aws-sdk-ruby) has useful methods that make it a snap to get this done. "url_for" can generate a temporary readable URL for an otherwise private S3 object.

Here's how to create a readable URL that expires after 5 minutes:

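# AWS SDK for Ruby, version 1 (the AWS:: namespace)
object = AWS::S3.new.buckets['BUCKET'].objects['KEY']
# :expires is given in seconds, so 300 yields a 5-minute URL
object.url_for(:read, :expires => 300).to_s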

AWS documentation: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/S3Object.html#url_for-instance_method
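To give a feel for why an expiring URL like this is safe to hand out, here is a simplified sketch of the general idea: the server signs the path plus an expiry time with a secret, and anyone checking the URL recomputes the signature. This is not S3's actual signing scheme, and the helper names and query parameter names are invented for the example.

```ruby
require 'openssl'

# HMAC over the path and expiry time. Changing either one
# invalidates the signature.
def signature_for(path, expires_at, secret)
  digest = OpenSSL::Digest.new('SHA256')
  OpenSSL::HMAC.hexdigest(digest, secret, "#{path}\n#{expires_at}")
end

# Returns the path with an expiry timestamp and signature appended.
def signed_url(path, secret, ttl: 300, now: Time.now)
  expires_at = now.to_i + ttl
  "#{path}?expires=#{expires_at}&signature=#{signature_for(path, expires_at, secret)}"
end

# Compares two strings without stopping at the first difference.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# True only if the URL has not expired and the signature matches.
def valid_request?(path, expires, signature, secret, now: Time.now)
  return false if now.to_i > expires.to_i
  constant_time_equal?(signature_for(path, expires.to_i, secret), signature.to_s)
end
```

Because the expiry time is part of the signed data, a user cannot extend the lifetime of a link by editing the query string.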


S3 is a separate service and does not know about your sessions.

The generic solution is to recognize the benefits and security properties of assigning each asset a separate, unique, very long, and random key that forms part of the URL to that asset. If you so choose, you can even assign a key with 512 effective bits of randomness, and that URL will remain unguessable for a very long time.

  • Because someone who at time t has access to an asset can simply copy the asset for future reference, it makes sense to permit that person to know the URL and access the asset at any time.
  • Likewise, since that person can simply download the asset and distribute it to others, it makes sense to permit that person to distribute the URL to others to whom he would otherwise simply have distributed the asset itself.
  • Since all such access is read-only, and since writes are restricted to the website servers, there is no risk of malicious "hacking" from anyone who has this access.

You have to determine if this is sufficient security. If it isn't, then maybe S3 isn't for you, and maybe you need to store your images as binary columns in your database and cache them in memcached, which you can do on Heroku.


I think the best you can do is what drop.io does. While the data is in principle accessible to anyone, you give it a large and random URL. Anyone who knows the URL can access it, but your application controls who gets to see the URL.

Kind of security through obscurity.

You can think of it as the password included in the URL. This means that if you are serious about security, you have to treat the URL as confidential information. You have to make sure that these links do not leak to search engines, too.

It is also tricky to revoke access rights. The only thing you can do is invalidate a URL and assign a new one.
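A rough sketch of that invalidate-and-reassign step, using an in-memory hash as a stand-in for a database table. The `AssetLinks` class and the `/assets/` URL prefix are hypothetical, and on S3 itself you would additionally have to copy the object to the new key and delete the old one.

```ruby
require 'securerandom'

# Maps each asset id to its current secret URL token.
class AssetLinks
  def initialize
    @tokens = {}
  end

  # Returns the current secret URL, creating a token on first use.
  def url_for(asset_id)
    token = (@tokens[asset_id] ||= SecureRandom.urlsafe_base64(32))
    "/assets/#{token}"
  end

  # Replaces the token, so every previously shared URL stops resolving.
  def revoke(asset_id)
    @tokens[asset_id] = SecureRandom.urlsafe_base64(32)
    url_for(asset_id)
  end

  # Returns the asset id for a URL, or nil if the URL is unknown or revoked.
  def resolve(url)
    @tokens.key(url.sub(%r{\A/assets/}, ''))
  end
end
```

Note that revoking this way cuts off everyone who had the old link, so any users who should keep access need to be given the new one.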
