Resuming an S3 bucket listing via boto
I'm iterating over 2 million objects like this:
import boto

conn = boto.connect_s3('xxx', 'xxx')
bucket = conn.lookup('bucket_name')
for key in bucket.list():
    somefunction(key.name)
Say it fails at the millionth object. How would I go about resuming this operation from that point?
I figured it out by looking at the boto source.
def list(self, prefix='', delimiter='', marker='', headers=None):
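Passing the name of the last key you processed (key.name) as marker resumes the listing from that point: S3 returns the keys that sort after the marker, so the marker key itself is not listed again.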
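To make that concrete, here is a minimal sketch of a wrapper that remembers the last key handled successfully, so a failed run can be restarted from it. The names process_keys and handle are made up for illustration; the only boto call it relies on is bucket.list(marker=...) from the signature above.

```python
def process_keys(bucket, handle, marker=''):
    """Call handle(name) for every key listed after `marker`.

    Returns (last_done, error): last_done is the name of the last key
    that was handled successfully (or the incoming marker if none was),
    and error is the exception that stopped the run, or None.
    """
    last_done = marker
    try:
        # bucket.list() pages through the bucket lazily; the marker
        # makes the listing start after that key name.
        for key in bucket.list(marker=marker):
            handle(key.name)
            last_done = key.name
    except Exception as exc:
        return last_done, exc
    return last_done, None


# Hypothetical usage with the objects from the question:
#   last_done, err = process_keys(bucket, somefunction)
#   if err is not None:
#       last_done, err = process_keys(bucket, somefunction, marker=last_done)
```

Because last_done is only updated after handle() returns, the key that caused the failure is handed to handle() again on the next run.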
An example of resuming requests using the marker parameter.
This is also useful if you want to recurse through subtrees or have many millions of objects to crawl through and don't want them in a single list.
marker = ''  # empty marker: start from the first key
while True:
    # Fetch one page of results, starting after the marker
    keys = bucket.get_all_keys(marker=marker)
    last_key = None
    for k in keys:
        # TODO Do something with your keys!
        last_key = k.name
    if not keys.is_truncated:
        break
    marker = last_key
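If the whole process can die, the marker has to live somewhere outside it. Below is a rough sketch of the same loop that writes the marker to a file after each page. The function name crawl_with_checkpoint, the handle callback and the checkpoint file are assumptions for illustration, not part of boto; only get_all_keys(marker=...) and is_truncated are taken from the loop above.

```python
import os


def crawl_with_checkpoint(bucket, handle, checkpoint_path):
    """Page through a bucket, saving the marker after every full page."""
    marker = ''
    if os.path.exists(checkpoint_path):
        # Resume from the marker left behind by a previous run.
        with open(checkpoint_path) as f:
            marker = f.read()
    while True:
        keys = bucket.get_all_keys(marker=marker)
        last_key = None
        for k in keys:
            handle(k.name)
            last_key = k.name
        if last_key is None:
            break  # empty page, nothing left to do
        marker = last_key
        # Only written once the whole page has been handled.
        with open(checkpoint_path, 'w') as f:
            f.write(marker)
        if not keys.is_truncated:
            break
```

The checkpoint is only updated once per page, so after a crash the keys of the interrupted page are handled a second time. That is fine as long as whatever you do per key is safe to repeat.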