
Amazon EC2 -- Read Data From S3?

I have many data files (let's call them input_files) that are stored in Amazon S3.

I would like to start about 15 independent Amazon EC2 linux instances. These instances should load the input_files (that are stored in S3) and process them independently.

I'd like all the 15 independent Amazon EC2 linux instances to write to the same output file.

Upon completion, this output file will be saved in S3.

Two questions:

(1) Is it possible for Amazon EC2 linux instances to connect to S3 and read data from it?

(2) How can I arrange that all the 15 independent Amazon EC2 linux instances would write to the same output file? Can I have this file in S3, and all instances will write to it?


(1) Yes. You can access S3 from anywhere on the internet, including from EC2 instances, using the S3 public API.

(2) It seems you are describing a database. S3 is simply a file store: you don't write to files on S3, you save whole files to S3.

Maybe you should look into some type of database instead.
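If a database is more than you need, another common pattern is to let each instance save its own part under a unique key and concatenate the parts once every instance has finished. A rough sketch of that idea, assuming a boto3-style S3 client is passed in as `s3`; the helper names, the key layout and the bucket/prefix values are made up for illustration.

```python
def part_key(prefix, worker_id):
    """Unique key per worker so no two instances touch the same object."""
    return "%s/part-%05d" % (prefix, worker_id)


def write_part(s3, bucket, prefix, worker_id, data):
    """Called on each instance: save this worker's output as its own object."""
    s3.put_object(Bucket=bucket, Key=part_key(prefix, worker_id), Body=data)


def merge_parts(s3, bucket, prefix, final_key):
    """Called once after all workers finish: concatenate parts into one object.

    final_key should live outside the prefix so it is not picked up as a part.
    """
    # list_objects_v2 returns at most 1000 keys per call; enough for 15 parts.
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix + "/")
    keys = sorted(obj["Key"] for obj in listing.get("Contents", []))
    merged = b"".join(
        s3.get_object(Bucket=bucket, Key=key)["Body"].read() for key in keys
    )
    s3.put_object(Bucket=bucket, Key=final_key, Body=merged)
    return keys
```

Each instance would call something like `write_part(s3, "my_bucket", "results", 3, output_bytes)`, and a single job would call `merge_parts(s3, "my_bucket", "results", "final/output.txt")` afterwards. Knowing when all 15 instances are done is a separate problem that this sketch does not solve.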


I suggest you take a look at this: http://docs.aws.amazon.com/IAM/latest/UserGuide/role-usecase-ec2app.html

Imagine that you are an administrator who manages your organization's AWS resources. Developers in your organization have applications that run on Amazon EC2 instances. These applications require access to other AWS resources—for example, making updates to Amazon S3 buckets.

Applications that run on an Amazon EC2 instance must sign their AWS API requests with AWS credentials. One way to do this is for developers to pass their AWS credentials to the Amazon EC2 instance, allowing applications to use the credentials to sign requests.

However, when AWS credentials are rotated, developers have to update each Amazon EC2 instance that uses their credentials.

To see how to do this with Python, see: https://groups.google.com/forum/?fromgroups=#!topic/boto-users/RPoFskVw1gc

The basic procedure is as follows:

First, create a JSON policy document that describes which services and resources the IAM role may access. For example, this policy grants all S3 actions on the bucket "my_bucket" and on the objects in it (object-level actions such as reading a file are checked against the "my_bucket/*" resource, so both entries are needed). You can use whatever policy is appropriate for your application.

    BUCKET_POLICY = """{
      "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:*"],
        "Resource": ["arn:aws:s3:::my_bucket", "arn:aws:s3:::my_bucket/*"]
      }]
    }"""

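If you would rather not type the JSON by hand, the same kind of policy string can be generated with the standard `json` module. This is only a sketch: the `bucket_policy` helper is a made-up name, and the `Version` field and the object-level `my_bucket/*` resource are my additions (object actions such as `s3:GetObject` are evaluated against object ARNs, not the bucket ARN).

```python
import json


def bucket_policy(bucket_name):
    """Return an IAM policy document (JSON string) granting all S3 actions
    on one bucket and on the objects inside it."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": [
                "arn:aws:s3:::%s" % bucket_name,    # the bucket itself
                "arn:aws:s3:::%s/*" % bucket_name,  # objects in the bucket
            ],
        }],
    }
    return json.dumps(policy, indent=2)


BUCKET_POLICY = bucket_policy("my_bucket")
```

Building the document from a dict means a missing quote or bracket shows up as a Python error on your machine rather than as a rejected policy from IAM.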
Next, you need to create an Instance Profile in IAM.

    import boto

    c = boto.connect_iam()
    instance_profile = c.create_instance_profile('myinstanceprofile')

Once you have the instance profile, you need to create the role, add the role to the instance profile and associate the policy with the role.

    role = c.create_role('myrole')
    c.add_role_to_instance_profile('myinstanceprofile', 'myrole')
    c.put_role_policy('myrole', 'mypolicy', BUCKET_POLICY)

Now, you can use that instance profile when you launch an instance:

    ec2 = boto.connect_ec2()
    ec2.run_instances('ami-xxxxxxx', ..., instance_profile_name='myinstanceprofile')

And the new instance should have the appropriate role and credentials associated with it once it is launched.
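Once the instance is running with that role, reading an input file from S3 could look roughly like this. It is a sketch assuming boto 2's S3 interface (`get_bucket`, `get_key`, `get_contents_as_string`); the `read_input_file` helper and the file names are hypothetical. The connection is passed in as an argument so the function can be tried without AWS access.

```python
def read_input_file(conn, bucket_name, key_name):
    """Return the contents of one S3 object, or None if the key is absent."""
    bucket = conn.get_bucket(bucket_name)
    key = bucket.get_key(key_name)  # boto returns None for a missing key
    if key is None:
        return None
    return key.get_contents_as_string()
```

On the instance you would create the connection with `conn = boto.connect_s3()` and no access keys; boto should then pick up the role's temporary credentials from the instance metadata. A call would look like `read_input_file(conn, 'my_bucket', 'input_files/some_file.txt')`.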

There are similar tutorials for Java, Ruby and other languages on the Amazon website; see the first URL above to find them.
