Archiving thousands of files and 7zip limitations

My application runs a daily task in which 100,000+ PDF files (~50 KB each) need to be zipped. Currently, I'm using 7-Zip and calling 7za.exe (the command-line tool that ships with 7-Zip) to zip each file (the files are located in many different folders).

What are the limitations of this approach, and how can they be solved? Is there a file size or file count limit for a 7-Zip archive?


The limit on file size is 16 exabytes, or 16,000,000,000 GB.

There is no hard limit on the number of files, but there is a practical limit in how 7-Zip manages the headers for the files. The exact limit depends on the path lengths, but on a 32-bit system you'll run into trouble somewhere around a million files.

I'm not sure if any other format supports more. Regular zip (without the ZIP64 extensions) has far smaller limits: 65,535 entries, and 4 GB per file and per archive.
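For the daily job in the question, one way to avoid launching 7za.exe once per file is to write all the paths to a list file and pass it to a single `7za a` call using 7-Zip's `@listfile` syntax. Here is a minimal sketch, assuming `7za` is on the PATH and all PDFs live under one root folder. The function names and file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def write_list_file(paths, list_path):
    """Write one path per line (UTF-8) for 7-Zip's @listfile syntax."""
    list_path = Path(list_path)
    text = "\n".join(str(p) for p in paths) + "\n"
    list_path.write_text(text, encoding="utf-8")
    return list_path


def build_command(archive_path, list_path, exe="7za"):
    """Build a single 'add' command that reads file names from a list file."""
    # "a" adds files to an archive, "@file" names the list file,
    # and -scsUTF-8 states the list file's character set explicitly.
    return [exe, "a", str(archive_path), f"@{list_path}", "-scsUTF-8"]


def archive_pdfs(root, archive_path, list_path):
    """Archive every PDF under root with one 7za invocation."""
    root = Path(root).resolve()
    # Paths are written relative to root and 7za is run from root, so that
    # files with the same name in different folders do not collide.
    pdfs = sorted(p.relative_to(root) for p in root.rglob("*.pdf"))
    listing = write_list_file(pdfs, Path(list_path).resolve())
    cmd = build_command(Path(archive_path).resolve(), listing)
    subprocess.run(cmd, check=True, cwd=root)
```

A call such as `archive_pdfs("C:/data/pdfs", "C:/backup/daily.7z", "C:/backup/pdf-list.txt")` (hypothetical paths) would then produce one archive per run. With roughly 100,000 files per archive, this stays well below the million-file range mentioned above.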

http://en.wikipedia.org/wiki/7-Zip

One notable limitation of 7-Zip is that, while it supports file sizes of up to 16 exabytes, it has an unusually high overhead allocating memory for files, on top of the memory requirements for performing the actual compression.

Approximately 1 kilobyte is required per file (more if the pathname is very long), and the file listing alone can grow to an order of magnitude greater than the memory required to do the actual compression. In real-world terms, this means 32-bit systems cannot compress more than a million or so files in one archive, as the memory requirements exceed the 2 GB process limit.

64-bit systems do not suffer from the same process size limitation, but still require several gigabytes of RAM to overcome this limitation. Archives created on such systems would, however, be unusable on machines with less memory.
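To put the quoted figures in terms of the question's workload, here is a rough back-of-the-envelope calculation. It assumes the "approximately 1 kilobyte per file" estimate from the quote and ignores long paths and the memory used by compression itself, so real usage will be higher. The constant and function names are illustrative only.

```python
BYTES_PER_FILE = 1024            # ~1 KB per entry, per the quoted estimate
PROCESS_LIMIT = 2 * 1024**3      # 2 GB limit of a 32-bit process


def header_memory_bytes(file_count, bytes_per_file=BYTES_PER_FILE):
    """Estimated memory used by the file listing alone."""
    return file_count * bytes_per_file


def max_files_within(limit_bytes, bytes_per_file=BYTES_PER_FILE):
    """Upper bound on entries if the listing could use the whole limit."""
    return limit_bytes // bytes_per_file


if __name__ == "__main__":
    for n in (100_000, 1_000_000):
        mib = header_memory_bytes(n) / 1024**2
        print(f"{n:>9,} files -> ~{mib:,.0f} MiB for the file listing")
    print(f"ceiling at 1 KB/file: {max_files_within(PROCESS_LIMIT):,} files")
```

Under these assumptions, 100,000 files need roughly 98 MiB for the listing and a million files need roughly 977 MiB. The theoretical ceiling of about two million entries shrinks toward the quoted "million or so" once longer paths and the compressor's own memory are counted.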
