Why is the result of the GZip algorithm not the same in Android and .NET?
My code in android:
public static String compressString(String str) {
String str1 = null;
ByteArrayOutputStream bos = null;
try {
bos = new ByteArrayOutputStream();
BufferedOutputStream dest = null;
byte b[] = str.getBytes();
GZIPOutputStream gz = new GZIPOutputStream(bos, b.length);
gz.write(b, 0, b.length);
bos.close();
gz.close();
} catch (Exception e) {
System.out.println(e);
e.printStackTrace();
}
byte b1[] = bos.toByteArray();
return Base64.encode(b1);
}
My code in the .Net WebService:
public static string compressString(string text)
{
byte[] buffer = Encoding.UTF8.GetBytes(text);
MemoryStream ms = new MemoryStream();
using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
{
zip.Write(buffer, 0, buffer.Length);
}
ms.Position = 0;
MemoryStream outStream = new MemoryStream();
byte[] compressed = new byte[ms.Length];
ms.Read(compressed, 0, compressed.Length);
byte[] gzBuffer = new byte[compressed.Length + 4];
System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
return Convert.ToBase64String(gzBuffer);
}
In android:
compressString("hello"); -> "H4sIAAAAAAAAAMtIzcnJBwCGphA2BQAAAA=="
In .Net:
compressString("hello"); -> "BQAAAB+LCAAAAAAABADtvQdgHEmWJSYvbcp7f0r1StfgdKEIgGATJNiQQBDswYjN5pLsHWlHIymrKoHKZVZlXWYWQMztnbz33nvvvffee++997o7nU4n99//P1xmZAFs9s5K2smeIYCqyB8/fnwfPyLmeVlW/w+GphA2BQAAAA=="
Interestingly, when I use the Decompress method in Android on the result of the .NET compressString method, it returns the original string correctly, but I get an error when I decompress the result of the Android compressString method.
Android Decompress method:
public static String Decompress(String zipText) throws IOException {
int size = 0;
byte[] gzipBuff = Base64.decode(zipText);
ByteArrayInputStream memstream = new ByteArrayInputStream(gzipBuff, 4,
gzipBuff.length - 4);
GZIPInputStream gzin = new GZIPInputStream(memstream);
final int buffSize = 8192;
byte[] tempBuffer = new byte[buffSize];
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while ((size = gzin.read(tempBuffer, 0, buffSize)) != -1) {
baos.write(tempBuffer, 0, size);
}
byte[] buffer = baos.toByteArray();
baos.close();
return new String(buffer, "UTF-8");
}
I think that there is an error in Android compressString method. Can anybody help me?
In the Android version, you should close bos after you close gz.
Also, this line in compressString may give you problems:
byte b[] = str.getBytes();
That converts the characters to bytes using the platform's default charset, which is not guaranteed to be UTF-8 on every Java runtime (Android itself defaults to UTF-8, but it is safer to be explicit). The .NET version, on the other hand, uses UTF-8 explicitly. In Android, try this instead:
byte b[] = str.getBytes("UTF-8");
EDIT: On further looking at your code, I suggest that you rewrite it like this:
byte b[] = str.getBytes("UTF-8");
GZIPOutputStream gz = new GZIPOutputStream(bos);
gz.write(b, 0, b.length);
gz.finish();
gz.close();
bos.close();
The changes are: use UTF-8 to encode characters; use the default internal buffer size for the GZIPOutputStream; call gz.close() before calling bos.close() (the latter probably isn't even needed); and call gz.finish() before calling gz.close().
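For reference, the rewritten lines can be folded into a complete method roughly as follows. This is a minimal sketch, not the asker's exact code: the class name GzipCompressSketch is made up for illustration, and java.util.Base64 (Java 8+, Android API 26+) stands in for the Base64 helper used in the question, which is not shown. The try-with-resources block closes gz, which also finishes the GZIP stream, before the bytes are read from bos.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

// Hypothetical class name, used only for this sketch.
final class GzipCompressSketch {

    static String compressString(String str) {
        // Encode explicitly as UTF-8 so the bytes match Encoding.UTF8 on the .NET side.
        byte[] input = str.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();

        // Closing gz (done automatically here) finishes the stream and writes the
        // GZIP trailer; only after that is bos.toByteArray() complete.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(input, 0, input.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Assumption: java.util.Base64 replaces the question's own Base64 class.
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }
}
```

Note that this produces a plain GZIP stream with no extra prefix, so it still differs from the .NET output in the question.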
EDIT 2:
Okay, I should have realized before what's going on. The GZIPOutputStream class is, in my opinion, poorly designed: it gives you no constructor parameter for choosing the compression level, so you always get the Deflater's default level. To change it, you need to subclass GZIPOutputStream and set the level yourself. The easiest way is to do this:
GZIPOutputStream gz = new GZIPOutputStream(bos) {
{
def.setLevel(Deflater.BEST_COMPRESSION);
}
};
That sets the internal Deflater used by GZIPOutputStream to the best compression level. (By the way, in case you aren't familiar with it, the syntax I'm using here is called an instance initializer block.)
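To see the instance initializer in context, here is a small self-contained sketch that wraps it in a helper taking the level as a parameter. The class name GzipLevelSketch and the method gzip are hypothetical; def is the protected Deflater field inherited from DeflaterOutputStream. Changing the level affects only how hard the compressor works, not the container format, so any GZIP reader should still accept the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, used only for this sketch.
final class GzipLevelSketch {

    // level is one of the java.util.zip.Deflater constants,
    // e.g. Deflater.BEST_SPEED or Deflater.BEST_COMPRESSION.
    static byte[] gzip(byte[] input, final int level) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos) {
            {
                // Instance initializer: runs right after the superclass constructor,
                // before any data has been written.
                def.setLevel(level);
            }
        }) {
            gz.write(input, 0, input.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

A call such as GzipLevelSketch.gzip(bytes, Deflater.BEST_COMPRESSION) would then correspond to the snippet above.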
According to this answer, I have four methods: compress and decompress for both Android and .NET. These methods are compatible with each other except in one case.
The main difference is that your .NET code puts the length of the uncompressed data into the first four bytes of the binary output. Your Java code doesn't do this, so the length field is missing.
When you decompress, however, you expect the length in the first four bytes and start the GZIP decompression at position 4 (skipping the first four bytes). For the Android output, that skips the start of the GZIP header itself, which is why decompression fails.
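One way to make the Android side match is to write the same four-byte prefix before the GZIP data. The following is a minimal sketch under a few assumptions: the class name LengthPrefixedGzip is made up, java.util.Base64 stands in for the question's Base64 helper, and the prefix is written little-endian because that is what BitConverter.GetBytes produces on little-endian hardware (typical x86 and ARM machines).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical class name, used only for this sketch.
final class LengthPrefixedGzip {

    static String compressString(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();

        // 4-byte prefix: length of the UNcompressed data, little-endian
        // (assumed to mirror BitConverter.GetBytes(buffer.Length) in the .NET code).
        byte[] prefix = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(input.length)
                .array();
        bos.write(prefix, 0, prefix.length);

        // The GZIP stream follows directly after the prefix.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(input, 0, input.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    static String decompressString(String zipText) {
        byte[] data = Base64.getDecoder().decode(zipText);
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Skip the 4-byte length prefix, as the Decompress method in the question does.
        try (GZIPInputStream gzin = new GZIPInputStream(
                new ByteArrayInputStream(data, 4, data.length - 4))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gzin.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this layout, the Base64 output for "hello" begins with "BQAAAB+L", the same leading characters as the .NET result in the question: the length 5 as four bytes, followed by the GZIP magic bytes. The alternative design is to drop the prefix on the .NET side and stop skipping four bytes when decompressing; either works as long as both ends agree.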