Convert tar.gz to zip
I've got a large collection of gzipped archives on my Ubuntu webserver, and I need them converted to zips. I figure this would be done with a script, but what language should I use, and how would I go about unzipping and rezipping files?
I'd do it with a bash(1) one-liner:
for f in *.tar.gz;\
do rm -rf "${f%.tar.gz}" ;\
mkdir "${f%.tar.gz}" ;\
tar -C "${f%.tar.gz}" -zxvf "$f" ;\
zip -r "${f%.tar.gz}.zip" "${f%.tar.gz}" ;\
rm -rf "${f%.tar.gz}" ;\
done
It isn't very pretty because I'm not great at bash(1). Note that for every archive this deletes any existing directory with the same name minus the .tar.gz suffix, so be sure you know what it does before running it.
See the bash(1) reference card for more details on the ${foo%bar} syntax, which removes the shortest trailing match of bar from the value of foo.
A simple bash script would be easiest, surely? That way you can just invoke the tar and zip commands.
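If you would rather not shell out, the same extract-then-rezip idea can be sketched in Python with only the standard library. This is a rough sketch, not a drop-in tool: the function names targz_to_zip and convert_all are made up for illustration, and it assumes the archives are trusted, since extracting an untrusted tarball to disk is risky. It works in a temporary directory, so nothing next to the archives gets deleted.

```python
import shutil
import tempfile
from pathlib import Path


def targz_to_zip(src):
    """Extract a .tar.gz into a temp dir and re-pack it as a .zip next to it.

    Hypothetical helper: returns the path of the zip file that was written.
    """
    src = Path(src).resolve()
    # Strip the ".tar.gz" suffix, like ${f%.tar.gz} in bash.
    base = src.with_name(src.name[: -len(".tar.gz")])
    with tempfile.TemporaryDirectory() as tmp:
        shutil.unpack_archive(str(src), tmp, format="gztar")
        # make_archive appends ".zip" to the base name by itself.
        return shutil.make_archive(str(base), "zip", root_dir=tmp)


def convert_all(directory="."):
    """Convert every *.tar.gz found directly inside `directory`."""
    return [targz_to_zip(p) for p in sorted(Path(directory).glob("*.tar.gz"))]
```

Calling convert_all("/path/to/archives") would leave the original .tar.gz files in place and write a .zip beside each one.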
The easiest solution on Unix platforms may well be to use FUSE and something like archivemount (libarchive), http://en.wikipedia.org/wiki/Archivemount .
/iaw
You can use node.js and tar-to-zip for this purpose. All you need to do is:
Install node.js with nvm if you do not have it.
And then install tar-to-zip with:
npm i tar-to-zip -g
And use it with:
tar-to-zip *.tar.gz
You can also convert .tar.gz files to .zip programmatically.
You should install async and tar-to-zip locally:
npm i async tar-to-zip
And then create converter.js with contents:
#!/usr/bin/env node

'use strict';

const fs = require('fs');
const tarToZip = require('tar-to-zip');
const eachSeries = require('async/eachSeries');
const names = process.argv.slice(2);

// convert the archives one at a time
eachSeries(names, convert, exitIfError);

function convert(name, done) {
    const {stdout} = process;
    const onProgress = (n) => {
        stdout.write(`\r${n}%: ${name}`);
    };
    const onFinish = () => {
        stdout.write('\n');
        done();
    };

    const nameZip = name.replace(/\.tar\.gz$/, '.zip');
    const zip = fs.createWriteStream(nameZip)
        .on('error', (error) => {
            // remove the partially written zip, then exit
            fs.unlinkSync(nameZip);
            exitIfError(error);
        });

    const progress = true;
    tarToZip(name, {progress})
        .on('progress', onProgress)
        .on('error', exitIfError)
        .getStream()
        .pipe(zip)
        .on('finish', onFinish);
}

function exitIfError(error) {
    if (!error)
        return;

    console.error(error.message);
    process.exit(1);
}
Zip files are handy because they offer random access to individual files. Tar files offer only sequential access.
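To make the difference concrete, here is a small Python sketch (read_zip_member and read_tar_member are names I made up for illustration). The zip reader looks the member up in the archive's central directory and jumps straight to it; the tar reader has to walk the archive header by header, decompressing along the way for a .tar.gz, until it finds the name.

```python
import tarfile
import zipfile


def read_zip_member(zip_path, member):
    """Read one member of a zip; it is located via the central directory."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read(member)


def read_tar_member(tar_path, member):
    """Read one member of a tar; tarfile scans the headers to locate it."""
    with tarfile.open(tar_path, mode="r:*") as tf:
        f = tf.extractfile(member)
        # extractfile() returns None for members that carry no file data
        return f.read() if f is not None else None
```

Both return the same bytes; on a large archive the zip lookup should be noticeably cheaper, though I have not benchmarked it here.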
My solution for this conversion is this shell script, which calls itself via the tar(1) "--to-command" option (I prefer that to having two scripts). But I admit "untar and zip -r" is faster than this because, unfortunately, zipnote(1) cannot work in place.
#!/bin/zsh -feu
## Convert a tar file into zip:
usage() {
    setopt POSIX_ARGZERO
    cat <<EOF
usage: ${0##*/} [+-h] [-v] [--] {tarfile} {zipfile}
-v verbose
-h print this message
converts the TAR archive into ZIP archive.
EOF
    unsetopt POSIX_ARGZERO
}
while getopts :hv OPT; do
    case $OPT in
        h|+h)
            usage
            exit
            ;;
        v)
            # todo: ignore TAR_VERBOSE from env?
            # Pass to the grand-child process:
            export TAR_VERBOSE=y
            ;;
        *)
            usage >&2
            exit 2
    esac
done
shift OPTIND-1
OPTIND=1
# when invoked w/o parameters:
if [ $# = 0 ] # todo: or stdin is not terminal
then
    # we are invoked by tar(1)
    if [ -n "${TAR_VERBOSE-}" ]; then echo $TAR_REALNAME >&2;fi
    zip --grow --quiet $ZIPFILE -
    # And rename it:
    # fixme: this still makes a full copy, so slow.
    printf "@ -\n@=$TAR_REALNAME\n" | zipnote -w $ZIPFILE
else
    if [ $# != 2 ]; then usage >&2; exit 1;fi
    # possibly: rm -f $ZIPFILE
    ZIPFILE=$2 tar -xaf $1 --to-command=$0
fi
Here is a Python solution based on this answer:
import sys, tarfile, zipfile, glob

def convert_one_archive(file_name):
    out_file = file_name.replace('.tar.gz', '.zip')
    with tarfile.open(file_name, mode='r:gz') as tf:
        with zipfile.ZipFile(out_file, mode='a', compression=zipfile.ZIP_DEFLATED) as zf:
            for m in tf.getmembers():
                # extractfile() returns None for directories and other
                # non-file members, so only regular files are copied
                if not m.isfile():
                    continue
                f = tf.extractfile(m)
                fl = f.read()
                fn = m.name
                zf.writestr(fn, fl)

for f in glob.glob('*.tar.gz'):
    convert_one_archive(f)
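One caveat with the script above: f.read() loads each member into memory in full, which can hurt with big files. A streaming variant might look like the sketch below. It assumes Python 3.6 or later (for writing through ZipFile.open), the name convert_streaming is mine, and unlike the script above it opens the zip with mode "w", so an existing output file is overwritten rather than appended to.

```python
import shutil
import tarfile
import zipfile


def convert_streaming(tar_path, zip_path):
    """Copy the regular files of a tar archive into a new zip, chunk by chunk."""
    with tarfile.open(tar_path, mode="r:*") as tf, \
         zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for member in tf:  # iterate lazily instead of calling getmembers()
            if not member.isfile():
                continue  # directories, links and devices are not copied
            src = tf.extractfile(member)
            # force_zip64 allows entries over 4 GiB, since the final size
            # is not known to zipfile when the entry is opened.
            with src, zf.open(member.name, mode="w", force_zip64=True) as dst:
                shutil.copyfileobj(src, dst)
```

shutil.copyfileobj moves the data in fixed-size chunks, so memory use should stay roughly constant regardless of member size.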
Here is a script based on @Brad Campbell's answer that works on files passed as command arguments, works with other tar file types (uncompressed or the other compression types supported by tarfile), and handles directories in the source tar file. It will also print warnings if the source file contains a symlink or hardlink, which are converted to regular files. For symlinks, the link is resolved during conversion. This can lead to an error if the link target is not in the tar; this is also potentially dangerous from a security standpoint, so user beware.
#!/usr/bin/python
import sys, tarfile, zipfile, glob, re

def convert_one_archive(in_file, out_file):
    with tarfile.open(in_file, mode='r:*') as tf:
        with zipfile.ZipFile(out_file, mode='a', compression=zipfile.ZIP_DEFLATED) as zf:
            for m in [m for m in tf.getmembers() if not m.isdir()]:
                if m.issym() or m.islnk():
                    print('warning: symlink or hardlink converted to file')
                f = tf.extractfile(m)
                fl = f.read()
                fn = m.name
                zf.writestr(fn, fl)

for in_file in sys.argv[1:]:
    out_file = re.sub(r'\.((tar(\.(gz|bz2|xz))?)|tgz|tbz|tbz2|txz)$', '.zip', in_file)
    if out_file == in_file:
        print(in_file, '---> [skipped]')
    else:
        print(in_file, '--->', out_file)
        convert_one_archive(in_file, out_file)
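If resolving links worries you, one cautious alternative is to skip them entirely and report what was left out. The sketch below is only that, a sketch: convert_skipping_links is a hypothetical name, it writes a fresh zip (mode "w") instead of appending, and it also skips devices and FIFOs, since tarfile cannot extract data from those.

```python
import tarfile
import zipfile


def convert_skipping_links(in_file, out_file):
    """Convert a tar archive to zip, copying regular files only.

    Returns the names of the members that were skipped (links, devices, ...).
    """
    skipped = []
    with tarfile.open(in_file, mode="r:*") as tf, \
         zipfile.ZipFile(out_file, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for m in tf.getmembers():
            if m.isdir():
                continue  # directories carry no data
            if not m.isfile():
                skipped.append(m.name)  # symlink, hardlink, device or FIFO
                continue
            zf.writestr(m.name, tf.extractfile(m).read())
    return skipped
```

The caller can then print the returned list and decide by hand what to do with each skipped entry.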