C# and Zip file manipulation

Here is what I am looking for:

I need to open a zip file of images and iterate through its contents. The zip container has subdirectories, and one of them, "IDX", houses the images I need. I have no problem extracting the zip file contents to a directory, but my zip files can be huge (several GB), so I am hoping to open the file and pull out the images one at a time as I iterate through them for processing.

After I am done I just close the zip file. These images are actually being housed in a database.

Does anyone have any idea how to do this with, hopefully, free tools or built-in APIs? This process will be done on a Windows machine.

Thanks!


SharpZipLib is a great tool for your requirements.

I have used it to process giant files within directories within giant nested zip files (meaning ZIP files within ZIP files), using streams. I was able to open a zip stream on top of a zip stream so that I could investigate the contents of the inner zip without having to extract the entire parent. You can then use a stream to peek at the content files, which may help you determine whether you want to extract it or not. It's open-source.
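As a rough sketch of the streaming approach described above, the following reads only the entries under an "IDX" folder and hands each one to a callback as a stream, without extracting the archive to disk. It assumes the SharpZipLib package (namespace ICSharpCode.SharpZipLib.Zip) is referenced. The class IdxImageReader, the helper IsInFolder, and the processImage callback are hypothetical names made up for this illustration, not part of the library.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class IdxImageReader
{
    // Hypothetical helper: true when the entry path contains a folder
    // segment named `folder` at any depth (case-insensitive).
    public static bool IsInFolder(string entryName, string folder)
    {
        // Normalise separators and add a leading slash so a top-level
        // folder is matched the same way as a nested one.
        string path = "/" + entryName.Replace('\\', '/');
        return path.IndexOf("/" + folder + "/", StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Opens the archive once and passes each matching entry to the
    // caller as a stream, one entry at a time.
    public static void ProcessImages(string zipPath, Action<string, Stream> processImage)
    {
        using (var zipFile = new ZipFile(zipPath))
        {
            foreach (ZipEntry entry in zipFile)
            {
                // Skip directory entries and anything outside "IDX".
                if (!entry.IsFile || !IsInFolder(entry.Name, "IDX"))
                    continue;

                // GetInputStream decompresses this entry on demand.
                using (Stream imageStream = zipFile.GetInputStream(entry))
                {
                    processImage(entry.Name, imageStream);
                }
            }
        }
    }
}
```

A caller might pass a lambda that copies the stream into a database parameter or a MemoryStream. The stream is only valid inside the callback, since it is disposed before the next entry is read.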

EDIT: Directory handling in the library is not ideal. As I recall, it contains separate entries for some directories, while others are implied by the paths of the file entries.

Here's an extract of the code I used to collect the actual file and folder names at a certain level (_startPath). Let me know if you're interested in the whole wrapper class.

// _zipFile = your ZipFile instance
List<string> _folderNames = new List<string>();
List<string> _fileNames = new List<string>();
string _startPath = "";
const string PATH_SEPARATOR = "/";

foreach ( ZipEntry entry in _zipFile )
{
    string name = entry.Name;

    if ( _startPath != "" )
    {
        if ( name.StartsWith( _startPath + PATH_SEPARATOR ) )
            name = name.Substring( _startPath.Length + 1 );
        else
            continue;
    }

    // Ignore items below this folder
    if ( name.IndexOf( PATH_SEPARATOR ) != name.LastIndexOf( PATH_SEPARATOR ) )
        continue;

    string thisPath = null;
    string thisFile = null;

    if ( entry.IsDirectory ) {
        thisPath = name.TrimEnd( PATH_SEPARATOR.ToCharArray() );
    }
    else if ( entry.IsFile )
    {
        if ( name.Contains( PATH_SEPARATOR ) )
            thisPath = name.Substring( 0, name.IndexOf( PATH_SEPARATOR ) );
        else
            thisFile = name;
    }

    if ( !string.IsNullOrEmpty( thisPath ) && !_folderNames.Contains( thisPath ) )
        _folderNames.Add( thisPath );

    if ( !string.IsNullOrEmpty( thisFile ) && !_fileNames.Contains( thisFile ) )
        _fileNames.Add( thisFile );
}


There are at least two more viable options besides SharpZipLib (which works fine):

  • DotNetZip on Codeplex

  • Microsoft seems to be investigating integrating ZIP functionality into the System.IO namespace - see this blog post for more info


.NET doesn't provide a way to read the contents of a standard ZIP file. The System.IO.Packaging.ZipPackage class can create and read zip files that include a special manifest. ZipPackage can't read archives that lack this manifest, although zip utilities can easily read a .zip created by ZipPackage. If you are the one creating the zips, ZipPackage may be an option. The classes that perform the actual compression and creation of the .zip file are internal to System.IO.Packaging, so you can't use them directly.

To convince your people that there is no OOTB way to open standard zips, you should mention that .NET also provides the System.IO.Compression.GZipStream class which only (de)compresses the contents of a file stream. It does not interpret them to separate files, directories etc.
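To make the GZipStream point concrete, here is a small sketch that round-trips a byte array through it. It uses only System.IO and System.IO.Compression; the class name GZipDemo and its methods are made up for this example. The output is one gzip stream that starts with the gzip signature bytes 0x1F 0x8B, with no entry names or directories, so it is not a ZIP container.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipDemo
{
    // Compresses a byte array into a single gzip stream.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed block.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // Reverses Compress: yields the original bytes, nothing more.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("image bytes would go here");
        byte[] packed = Compress(original);
        byte[] unpacked = Decompress(packed);
        Console.WriteLine(Encoding.UTF8.GetString(unpacked));
    }
}
```

There is no API here for listing or selecting files, which is why GZipStream alone cannot open a standard .zip.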

Jon Galloway covered all the options a while back in "Creating Zip archives in .NET (without an external library)", although none of them is as clean as the upcoming System.IO.Zip.
