Can't unzip archive built with zipfile (Python)
I'm having problems with an archive that I built using zipfile in Python. I'm iterating over all the files in a directory and writing them to an archive. When I attempt to extract them afterward I get an exception related to the path separator.
import cStringIO
import os
import zipfile

the_path = "C:\\path\\to\\folder"
zipped_cache = cStringIO.StringIO()
zf = zipfile.ZipFile(zipped_cache, "w", zipfile.ZIP_DEFLATED)
for dirname, subdirs, files in os.walk(the_path):
    for filename in files:
        # Store each file under its path relative to the_path
        zf.write(os.path.join(dirname, filename), os.path.join(dirname[1+len(the_path):], filename))
zf.extractall("C:\\destination\\path")
zf.close()
zipped_cache.close()
Here's the exception:
zipfile.BadZipfile: File name in directory "env\index" and header "env/index" differ.
Update: I replaced the string buffer cStringIO.StringIO() with a temporary file (tempfile.mkstemp("temp.zip")) and now it works. Something happens when the zipfile module writes to the buffer that corrupts the archive, but I'm not sure what the problem is.
The issue was that I was reading and writing the data through files opened in "r"/"w" mode instead of "rb"/"wb". This isn't an issue on Linux, but on Windows text mode alters the bytes as they pass through (newline translation, for example), which corrupted the archive. Solved.
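A minimal sketch of the binary-mode point, written for Python 3. It assumes io.BytesIO as the in-memory buffer (the Python 3 counterpart of cStringIO.StringIO), and the member name, file name, and helper functions are hypothetical. The archive bytes go to disk with "wb" and come back with "rb", so nothing is translated on the way:

```python
import io
import os
import tempfile
import zipfile


def build_archive_bytes(members):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    # The archive is only complete after the ZipFile has been closed
    return buffer.getvalue()


def round_trip(archive_bytes, directory):
    """Write the archive to disk in binary mode and read it back."""
    archive_path = os.path.join(directory, "example.zip")  # hypothetical name
    with open(archive_path, "wb") as fh:  # "wb", not "w"
        fh.write(archive_bytes)
    with open(archive_path, "rb") as fh:  # "rb", not "r"
        return fh.read()


payload = build_archive_bytes({"env/index": b"line one\nline two\n"})
with tempfile.TemporaryDirectory() as tmp:
    restored = round_trip(payload, tmp)

# Reopen the bytes that came back from disk and inspect the archive
with zipfile.ZipFile(io.BytesIO(restored)) as zf:
    names = zf.namelist()
    content = zf.read("env/index")
```

Note that the archive is read only after the writing ZipFile has been closed, since the central directory is written on close.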
You should consider adding an r before the string to indicate it is a raw string -- the backslashes in the path are being interpreted as escape characters.
The following code:
#!/usr/bin/env python
print(r"C:\destination\path")
print(r"C:\path\to\folder")
print("C:\destination\path")
print("C:\path\to\folder")
produces the following output:
C:\destination\path
C:\path\to\folder
C:\destination\path
C:\path o
older
Note that the \t and \f are interpreted as tab and formfeed in the last line.
Interestingly, you could also change the backslashes to forward slashes (i.e. open("C:/path/to/folder")), which would work.
Or, escape the backslashes with ... backslashes (i.e. open("C:\\path\\to\\folder")).
IMO, the clearest and easiest solution is to simply add an r.
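As a quick check of the spellings discussed in this answer, here is a small sketch (the variable names are only for illustration). The raw and double-backslash literals produce the same string, while an unescaped "\t" and "\f" collapse into single control characters:

```python
raw = r"C:\path\to\folder"
escaped = "C:\\path\\to\\folder"
forward = "C:/path/to/folder"

# Raw and escaped literals are two spellings of the same string
same_string = raw == escaped

# The forward-slash form differs only in the separator character
forward_matches = forward.replace("/", "\\") == raw

# Without the r prefix, \t and \f each become one control character
unescaped_tail = "\to\folder"
raw_tail = r"\to\folder"
lengths = (len(unescaped_tail), len(raw_tail))
```

The two tails have different lengths because the tab and form feed each replace a two-character sequence.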
Edit: It looks like you need to go with the second solution, forward slashes. The zipfile library is kind of strict apparently -- and given that this is a Windows-only bug, it probably snuck through. (See Issue 6839).
Found the answer to my question here: http://www.penzilla.net/tutorials/python/scripting.
I'm pasting the two functions that are relevant to zipping up a directory. The problem was not the string buffer, nor the slashes, but the way I was iterating and writing to the zipfile. These 2 recursive functions fix the problem. Iterating over the entire tree of sub-directories with os.walk is not a good way to write the archive.
import os
from zipfile import ZipFile, ZIP_DEFLATED

def zippy(path, archive):
    paths = os.listdir(path)
    for p in paths:
        p = os.path.join(path, p)  # Prefix the entry with its parent directory
        if os.path.isdir(p):  # Recursive case
            zippy(p, archive)
        else:
            archive.write(p)  # Write the file to the zipfile
    return

def zipit(path, archname):
    # Create a ZipFile Object primed to write
    archive = ZipFile(archname, "w", ZIP_DEFLATED)  # "a" to append, "r" to read
    # Recurse or not, depending on what path is
    if os.path.isdir(path):
        zippy(path, archive)
    else:
        archive.write(path)
    archive.close()
    return "Compression of \""+path+"\" was successful!"
You need to escape the backslashes in your paths.
Try changing the following:
the_path = "C:\path\to\folder"
to
the_path = "C:\\path\\to\\folder"
and
zf.extractall("C:\destination\path")
to
zf.extractall("C:\\destination\\path")
You can use forward slashes as path separators, even on Windows. I suggest trying that when you create the zip file.
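One way to apply this when building the archive, as a rough Python 3 sketch: compute each member name relative to the source folder and replace the platform separator with "/" before passing it to ZipFile.write. The zip_tree helper and the sample folder layout are hypothetical, and on platforms where os.sep is already "/" the replacement simply does nothing:

```python
import os
import tempfile
import zipfile


def zip_tree(source_dir, archive_path):
    """Zip source_dir, storing member names relative to it with '/' separators."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirname, subdirs, files in os.walk(source_dir):
            for filename in files:
                full_path = os.path.join(dirname, filename)
                relative = os.path.relpath(full_path, source_dir)
                # Store forward slashes in the archive regardless of platform
                arcname = relative.replace(os.sep, "/")
                zf.write(full_path, arcname)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny sample tree: folder/env/index
    source = os.path.join(tmp, "folder")
    os.makedirs(os.path.join(source, "env"))
    with open(os.path.join(source, "env", "index"), "wb") as fh:
        fh.write(b"data")

    archive = os.path.join(tmp, "out.zip")
    zip_tree(source, archive)

    # Reopen the finished archive for reading before extracting
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        zf.extractall(os.path.join(tmp, "dest"))
    extracted_ok = os.path.isfile(os.path.join(tmp, "dest", "env", "index"))
```

The archive is closed after writing and reopened in read mode before extractall is called, rather than extracting from the same object that is still being written.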