Comparing uncompressed local files to compressed files stored on Amazon S3?

We put hundreds of image files on Amazon S3 that our users need to synchronize to their local directories. In order to save storage space and bandwidth, we zip the files stored on S3.

On the user's end, a Python script runs every 5 minutes to get the current list of files and download new/updated ones.

My question is: what's the best way to determine which files are new or changed and need to be downloaded?

Currently we add a custom metadata header to each compressed file that contains the MD5 value of the uncompressed file...

We start with a file like this:

image_file_1.tif   17MB    MD5 = xxxx1234

We compress it (with 7zip) and upload it to S3 (with Python/Boto):

image_file_1.tif.z  9MB    MD5 = yyy3456    x-amz-meta-uncompressedmd5 = xxxx1234
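
For reference, here is a minimal sketch of that upload step with boto (the bucket, key, and function names are hypothetical). It computes the MD5 of the uncompressed file and attaches it as user metadata, which S3 serves back as the x-amz-meta-uncompressedmd5 header:

    import hashlib

    import boto
    from boto.s3.key import Key

    def upload_compressed(bucket_name, uncompressed_path, compressed_path, key_name):
        """Upload a compressed file, tagging it with the uncompressed file's MD5."""
        # Compute the MD5 of the *uncompressed* source file in chunks.
        md5 = hashlib.md5()
        with open(uncompressed_path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                md5.update(chunk)

        bucket = boto.connect_s3().get_bucket(bucket_name)
        key = Key(bucket, key_name)
        # Stored by S3 and returned as the x-amz-meta-uncompressedmd5 header.
        key.set_metadata('uncompressedmd5', md5.hexdigest())
        key.set_contents_from_filename(compressed_path)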

The problem is we can't get a list of files from S3 that includes the x-amz-meta-uncompressedmd5 header without making an additional API call for EACH file (slow for hundreds/thousands of files).

Our most practical solution so far: have users get the full list of files (without the extra headers) and download any file that does not exist locally. If a file does exist locally, make an additional API call to get its full headers and compare the local MD5 checksum against x-amz-meta-uncompressedmd5.
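
In code, that approach looks roughly like this (a sketch with boto; the ".z" suffix handling and helper names are assumptions based on the naming above):

    import hashlib
    import os

    import boto

    def local_md5(path):
        """Hex MD5 of a local (uncompressed) file."""
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                h.update(chunk)
        return h.hexdigest()

    def keys_to_download(bucket_name, local_dir):
        """Return the S3 keys that are missing locally or have changed."""
        bucket = boto.connect_s3().get_bucket(bucket_name)
        wanted = []
        for listed in bucket.list():        # LIST results carry no user metadata
            name = listed.name[:-2] if listed.name.endswith('.z') else listed.name
            local_path = os.path.join(local_dir, name)
            if not os.path.exists(local_path):
                wanted.append(listed.name)  # new file: no extra call needed
                continue
            key = bucket.get_key(listed.name)   # HEAD request per existing file
            if key.get_metadata('uncompressedmd5') != local_md5(local_path):
                wanted.append(listed.name)
        return wanted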

I'm thinking there must be a better way.


You could include the MD5 hash of the uncompressed image in the compressed filename.

So image_file_1.tif could become image_file_1.xxxx1234.tif.z

Your users' Python sync script would then have the information it needs to decide whether to fetch the file again from S3, and could either strip the MD5 out of the filename or keep it, depending on what you want to do.
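
A sketch of the client-side check under that naming scheme (the regex assumes full 32-character hex digests; the question's "xxxx1234" values are placeholders):

    import hashlib
    import os
    import re

    # Assumed naming scheme: <stem>.<32-hex-md5>.<ext>.z, e.g.
    # image_file_1.d41d8cd98f00b204e9800998ecf8427e.tif.z
    NAME_RE = re.compile(r'^(?P<stem>.+)\.(?P<md5>[0-9a-f]{32})\.(?P<ext>[^.]+)\.z$')

    def local_md5(path):
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b''):
                h.update(chunk)
        return h.hexdigest()

    def needs_download(key_name, local_dir):
        """Decide from the key name alone whether to fetch this object."""
        m = NAME_RE.match(key_name)
        if not m:
            return True                     # unexpected name: fetch to be safe
        local_path = os.path.join(
            local_dir, '%s.%s' % (m.group('stem'), m.group('ext')))
        return (not os.path.exists(local_path)
                or local_md5(local_path) != m.group('md5'))

The win here is that a single LIST request gives the sync script everything it needs; no per-file HEAD requests at all.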

Alternatively, you could maintain a single file on S3 containing the full file list with the MD5 metadata. The Python script then only needs to fetch that one file, parse it, and decide what to do.
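
A minimal sketch of that idea (the manifest key name and the tab-separated format are assumptions; you could also append to the manifest at upload time instead of rebuilding it):

    import boto

    MANIFEST_KEY = 'manifest.txt'   # hypothetical: one "key<TAB>md5" line per file

    def write_manifest(bucket):
        """Publisher side: rebuild the manifest from each key's metadata."""
        lines = []
        for listed in bucket.list():
            if listed.name == MANIFEST_KEY:
                continue
            # HEAD per file, but only the publisher pays this cost, once.
            key = bucket.get_key(listed.name)
            lines.append('%s\t%s' % (key.name, key.get_metadata('uncompressedmd5')))
        bucket.new_key(MANIFEST_KEY).set_contents_from_string('\n'.join(lines))

    def read_manifest(bucket):
        """Client side: a single GET returns every key's uncompressed MD5."""
        body = bucket.get_key(MANIFEST_KEY).get_contents_as_string()
        return dict(line.split('\t')
                    for line in body.decode('utf-8').splitlines() if line)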
