Convert tar.gz to zip

I've got a large collection of gzipped archives on my Ubuntu webserver, and I need them converted to zips. I figure this would be done with a script, but what language should I use, and how would I go about unzipping and rezipping files?


I'd do it with a bash(1) one-liner:

for f in *.tar.gz;\
do d="${f%.tar.gz}" ;\
rm -rf "$d" ;\
mkdir "$d" ;\
tar -C "$d" -zxvf "$f" ;\
(cd "$d" && zip -r "../$d.zip" .) ;\
rm -rf "$d" ;\
done

It isn't very pretty because I'm not great at bash(1). Note that for every foo.tar.gz this deletes any existing directory named foo, both before extracting and again after zipping, so be sure you know what it does before running it.

See the bash(1) reference card for more details on the ${foo%bar} syntax.


A simple bash script would be easiest, surely? That way you can just invoke the tar and zip commands.


The easiest solution on Unix platforms may well be to use FUSE and something like archivemount (built on libarchive): http://en.wikipedia.org/wiki/Archivemount

/iaw


You can use node.js and tar-to-zip for this purpose. All you need to do is:

Install node.js with nvm if you do not have it.

And then install tar-to-zip with:

npm i tar-to-zip -g

And use it with:

tar-to-zip *.tar.gz

You can also convert .tar.gz files to .zip programmatically. Install async and tar-to-zip locally:

npm i async tar-to-zip

And then create converter.js with contents:

#!/usr/bin/env node

'use strict';

const fs = require('fs');
const tarToZip = require('tar-to-zip');
const eachSeries = require('async/eachSeries');
const names = process.argv.slice(2);

eachSeries(names, convert, exitIfError);

function convert(name, done) {
    const {stdout} = process;
    const onProgress = (n) => {
        stdout.write(`\r${n}%: ${name}`);
    };
    const onFinish = () => {
        stdout.write('\n');
        done();
    };

    const nameZip = name.replace(/\.tar\.gz$/, '.zip');
    const zip = fs.createWriteStream(nameZip)
        .on('error', (error) => {
            // Remove the partial zip first: exitIfError() ends the process.
            fs.unlinkSync(nameZip);
            exitIfError(error);
        });

    const progress = true;
    tarToZip(name, {progress})
        .on('progress', onProgress)
        .on('error', exitIfError)
        .getStream()
        .pipe(zip)
        .on('finish', onFinish);
}

function exitIfError(error) {
    if (!error)
        return;

    console.error(error.message);
    process.exit(1);
}


Zip files are handy because they offer random access to individual files. Tar files only allow sequential access.
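To make the random-access point concrete, here is a minimal Python sketch (not part of the shell solution in this answer). The helper names and the tiny in-memory sample archives are made up for illustration. Reading one member from a zip goes through its central directory, while the tar.gz reader has to walk the entries in order until it reaches the one it wants.

```python
import io
import tarfile
import zipfile


def read_from_zip(zip_bytes, member):
    """Read one member via the zip central directory, without scanning the rest."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(member)


def read_from_tar(tar_bytes, member):
    """Read one member from a tar.gz by walking the entries in order."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode='r:gz') as tf:
        for info in tf:  # sequential walk over the member headers
            if info.name == member:
                return tf.extractfile(info).read()
    raise KeyError(member)


def build_samples():
    """Build a small zip and tar.gz in memory holding the same two files (sample data)."""
    files = {'a.txt': b'first', 'b.txt': b'second'}
    zbuf, tbuf = io.BytesIO(), io.BytesIO()
    with zipfile.ZipFile(zbuf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    with tarfile.open(fileobj=tbuf, mode='w:gz') as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return zbuf.getvalue(), tbuf.getvalue()


if __name__ == '__main__':
    zip_bytes, tar_bytes = build_samples()
    print(read_from_zip(zip_bytes, 'b.txt'))
    print(read_from_tar(tar_bytes, 'b.txt'))
```

With two small files the difference is invisible, but on a multi-gigabyte tar.gz the sequential walk means decompressing everything that precedes the wanted member.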

My solution for this conversion is the shell script below, which calls itself via the "--to-command" option of tar(1). (I prefer that to having two scripts.) I admit that "untar and zip -r" is faster than this, because zipnote(1) unfortunately cannot work in place.

#!/bin/zsh -feu

## Convert a tar file into zip:

usage() {
    setopt POSIX_ARGZERO
    cat <<EOF
    usage: ${0##*/} [+-h] [-v] [--] {tarfile} {zipfile}

-v verbose
-h print this message
converts the TAR archive into ZIP archive.
EOF
    unsetopt POSIX_ARGZERO
}

while getopts :hv OPT; do
    case $OPT in
        h|+h)
            usage
            exit
            ;;
        v)
            # todo: ignore TAR_VERBOSE from env?
            # Pass to the grand-child process:
            export TAR_VERBOSE=y
            ;;
        *)
            usage >&2
            exit 2
    esac
done
shift OPTIND-1
OPTIND=1

# when invoked w/o parameters:
if [ $# = 0 ] # todo: or stdin is not terminal
then
    # we are invoked by tar(1)
    if [ -n "${TAR_VERBOSE-}" ]; then echo $TAR_REALNAME >&2;fi
    zip --grow --quiet $ZIPFILE -
    # And rename it:
    # fixme: this still makes a full copy, so slow.
    printf "@ -\n@=$TAR_REALNAME\n" | zipnote -w $ZIPFILE
else
    if [ $# != 2 ]; then usage >&2; exit 1;fi
    # possibly: rm -f $ZIPFILE
    ZIPFILE=$2 tar -xaf $1 --to-command=$0
fi


Here is a Python solution based on this answer:

import tarfile, zipfile, glob

def convert_one_archive(file_name):
    out_file = file_name.replace('.tar.gz', '.zip')
    with tarfile.open(file_name, mode='r:gz') as tf:
        with zipfile.ZipFile(out_file, mode='a', compression=zipfile.ZIP_DEFLATED) as zf:
            for m in tf.getmembers():
                f = tf.extractfile(m)
                if f is None:
                    # directories and other non-file members have no data to copy
                    continue
                fl = f.read()
                fn = m.name
                zf.writestr(fn, fl)

for f in glob.glob('*.tar.gz'):
    convert_one_archive(f)
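The version above reads each member fully into memory before writing it. For archives with very large files, a streaming variant may be preferable. The following is only a sketch under a few assumptions: the function name convert_streaming is made up, Python 3.6 or later is required for writing through ZipFile.open, and anything that is not a regular file (directories, links, devices) is simply skipped.

```python
import shutil
import tarfile
import zipfile


def convert_streaming(tar_path, zip_path):
    """Copy regular files from a tar archive into a new zip, chunk by chunk."""
    with tarfile.open(tar_path, mode='r:*') as tf, \
         zipfile.ZipFile(zip_path, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
        for member in tf:
            if not member.isfile():
                continue  # this sketch skips directories, links and devices
            src = tf.extractfile(member)
            # force_zip64 allows members larger than 4 GiB, since the final
            # size is not known to zipfile when the entry is opened
            with zf.open(member.name, mode='w', force_zip64=True) as dst:
                shutil.copyfileobj(src, dst)
```

It could be driven the same way as the loop above, for example convert_streaming(f, f.replace('.tar.gz', '.zip')) for each f in glob.glob('*.tar.gz').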


Here is a script based on @Brad Campbell's answer that works on files passed as command-line arguments, works with other tar file types (uncompressed, or any other compression type supported by tarfile), and handles directories in the source tar file. It also prints a warning if the source file contains a symlink or hardlink, since these are converted to regular files. For symlinks, the link is resolved during conversion. This can lead to an error if the link target is not in the tar; it is also potentially dangerous from a security standpoint, so user beware.

#!/usr/bin/env python3

import sys, tarfile, zipfile, re

def convert_one_archive(in_file, out_file):
    with tarfile.open(in_file, mode='r:*') as tf:
        with zipfile.ZipFile(out_file, mode='a', compression=zipfile.ZIP_DEFLATED) as zf:
            for m in [m for m in tf.getmembers() if not m.isdir()]:
                if m.issym() or m.islnk():
                    print('warning: symlink or hardlink converted to file')
                f = tf.extractfile(m)
                fl = f.read()
                fn = m.name
                zf.writestr(fn, fl)

for in_file in sys.argv[1:]:
    out_file = re.sub(r'\.((tar(\.(gz|bz2|xz))?)|tgz|tbz|tbz2|txz)$', '.zip', in_file)
    if out_file == in_file:
        print(in_file, '---> [skipped]')
    else:
        print(in_file, '--->', out_file)
        convert_one_archive(in_file, out_file)
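If resolving links is a concern, one option is to skip them instead. The sketch below is a hypothetical variant, not the script above: the function name convert_skipping_links is made up, and it assumes that dropping symlinks, hardlinks and other special members is acceptable. It also carries over each file's modification time and permission bits, which zf.writestr with a plain name does not do (it stamps entries with the current time).

```python
import stat
import tarfile
import time
import zipfile


def convert_skipping_links(tar_path, zip_path):
    """Convert a tar archive to zip, keeping mtime and mode of regular files.

    Links and special members are not resolved; their names are returned.
    """
    skipped = []
    with tarfile.open(tar_path, mode='r:*') as tf, \
         zipfile.ZipFile(zip_path, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
        for member in tf.getmembers():
            if member.isdir():
                continue
            if not member.isfile():
                skipped.append(member.name)  # symlink, hardlink, device, fifo
                continue
            # the zip format cannot store timestamps before 1980
            stamp = max(time.localtime(member.mtime)[:6], (1980, 1, 1, 0, 0, 0))
            info = zipfile.ZipInfo(member.name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Unix mode bits live in the upper 16 bits of external_attr
            info.external_attr = (stat.S_IFREG | (member.mode & 0o7777)) << 16
            zf.writestr(info, tf.extractfile(member).read())
    return skipped
```

The returned list can be printed as a warning, in the same spirit as the message in the script above.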