How to simulate ZipFile.open in Python 2.5?

2023-01-17 03:16 问答作者：

I want to extract a file from a zip to a specific path, ignoring the file path in the archive. This is very easy in Python 2.6 (my docstring is longer than the code)

import shutil
import zipfile

def extract_from_zip(name, dest_path, zip_file):
    """Similar to zipfile.ZipFile.extract but extracts the file given by name
    from the zip_file (instance of zipfile.ZipFile) to the given dest_path
    *ignoring* the filename path given in the archive completely
    instead of preserving it as extract does.
    """
    dest_file = open(dest_path, 'wb')
    archived_file = zip_file.open(name)
    shutil.copyfileobj(archived_file, dest_file)


 extract_from_zip('path/to/file.dat', 'output.txt', zipfile.ZipFile('test.zip', 'r'))

But in Python 2.5, The ZipFile.open method is not available. I couldn't find a solution on stackoverflow, but this forum post had a nice solution that makes use of the ZipInfo.file_offset to seek to the right point in the zip and use zlib.decompressobj to unpack the bytes from there. Unfortunately ZipInfo.file_offset was removed in Python 2.5!

So, given that all we have in Python 2.5 is the ZipInfo.header_offset, I figured I'd just have to parse and skip over the header structure to get to the file offset myself. Using Wikipedia as a reference (I know) I came up with this much longer and not very elegant solution.

import zipfile
import zlib

def extract_from_zip(name, dest_path, zip_file):
    """Python 2.5 version :("""
    dest_file = open(dest_path, 'wb')
    info = zip_file.getinfo(name)
    if info.compress_type == zipfile.ZIP_STORED:
        decoder = None
    elif info.compress_type == zipfile.ZIP_DEFLATED:
        decoder = zlib.decompressobj(-zlib.MAX_WBITS)
    else:
        raise zipfile.BadZipFile("Unrecognized compression method")

    # Seek over the fixed size fields to the "file name length" field in
    # the file header (26 bytes). Unpack this and the "extra field length"
    # field ourselves as info.extra doesn't seem to be the correct length.
    zip_file.fp.seek(info.header_offset + 26)
    file_name_len, extra_len = struct.unpack("<HH", zip_file.fp.read(4))
    zip_file.fp.seek(info.header_offset + 30 + file_name_len + extra_len)

    bytes_to_read = info.compress_size

    while True:
        buff = zip_file.fp.read(min(bytes_to_read, 102400))
        if not buff:
            break
        bytes_to_read -= len(buff)
        if decoder:
            buff = decoder.decompress(buff)
        dest_file.write(buff)

    if decoder:
        dest_file.write(decoder.decompress('Z'))
      开发者_Python百科  dest_file.write(decoder.flush())

Note how I unpack and read the field that gives the length of the extra field, because calling len on the ZipInfo.extra attribute gives 4 bytes less, thus causing the offset to be calculated incorrectly. Perhaps I'm missing something here?

Can anyone improve on this solution for Python 2.5?

Edit: I should have said, the obvious solution as suggested by ChrisAdams

dest_file.write(zip_file.read(name))

will fail with MemoryError for any reasonably sized file contained in the zip because it tries to slurp the whole file into memory in one go. I have large files, so I need to stream out the contents to disk.

Also, upgrading Python is the obvious solution, but one that is entirely out of my hands and essentially impossible.

Haven't tested this bit, but I use something extremely similar in Python 2.4

import zipfile

def extract_from_zip(name, dest_path, zip_file):
    dest_file = open(dest_path, 'wb')
    dest_file.write(zip_file.read(name))
    dest_file.close()

extract_from_zip('path/to/file/in/archive.dat', 
        'output.txt', 
        zipfile.ZipFile('test.zip', 'r'))

I know I am a bit late to the party for this question, but was having the exact same problem.

The solution I used was to copy the python 2.6.6 version of zipfile and put in a folder (I called it python_fix) and import that instead:

python_fix/zipfile.py

Then in code:

import python_fix.zipfile as zipfile

From there I was able to use the 2.6.6 version of zipfile with the python 2.5.1 interpreter (the 2.7.X versions fail on the "with" with this version")

Hope this helps someone else using ancient technology.

Given my constraints, it looks like the answer was given in my question: parse the ZipFile structure yourself and use zlib.decompressobj to unzip the bytes once you've found them.

If you don't have (/suffer from) my constraints, you can find better answers here:

If you can, just upgrade Python 2.5 to 2.6 (or later!), as suggested in a comment by Daenyth.
If you only have small files in the zip which can be 100% loaded in memory, use ChrisAdams' answer
If you can introduce a dependency on an external utility, make an appropriate system call to /usr/bin/unzip or similar, as suggested in Vlad's answer

继续阅读：python python-2.5 zip

How to simulate ZipFile.open in Python 2.5?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？