Calculating a Git packfile SHA-1 checksum in Java

I am learning about the Git packfile and am trying to reproduce (in Java) what I believe to be the 20-byte SHA-1 checksum for the entire packfile. I take the byte array from, and including, the 4-byte "PACK" header to the end of the last packaged object's compressed data. Everything I have read indicates that the next 20 bytes are the SHA-1 checksum for the entire packfile.

The 20-byte checksum that is part of the byte array received from Git is: B910248BF9B63AC53595E3835CA57BDAF08DA830

I use the following to calculate my own SHA1 checksum:

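import java.security.MessageDigest;

// testData holds the bytes from the "PACK" header through the last object's compressed data
MessageDigest crypt = MessageDigest.getInstance("SHA-1");
crypt.reset();
crypt.update(testData);
byte[] result = crypt.digest();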

My result ends up as: B910248BF9B63AC53595E3835CA57BDAF08DA813

I am baffled that only the last byte of my result differs from Git's (assuming I am hashing the correct part of the byte stream). If the only problem were the range of data passed to update(), the entire calculated checksum would most likely look different.

Any ideas?


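Use JGit:

import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;

byte[] data = new byte[] { ... };
ObjectInserter.Formatter f = new ObjectInserter.Formatter();
ObjectId id = f.idFor(Constants.OBJ_BLOB, data);
String hash = id.getName();  // 40-character hex object ID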


The Git object ID is computed as follows (pseudocode):

sha1(obj_type | 0x20 | ascii(data_length) | 0x00 | data);

where obj_type is one of blob, commit, tree, or tag, and data_length is the length of data in bytes, written as a decimal ASCII string.

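Some Java code:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

byte[] getObjectId(String type, byte[] input) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-1");
    // Header is "<type> <length>\0", e.g. "blob 10\0"; encode it explicitly as ASCII
    md.update(String.format("%s %d\u0000", type, input.length).getBytes(StandardCharsets.US_ASCII));
    md.update(input);
    return md.digest();
}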
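
Because getObjectId returns raw bytes, comparing its result with git hash-object output needs a hex conversion. The following is a minimal sketch of that step; the class name GitObjectIdDemo and the toHex helper are hypothetical, and the checked exception is wrapped only to keep the example short. It is exercised on the empty blob, whose object ID is widely documented.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; not part of Git or JGit.
class GitObjectIdDemo {

    // SHA-1 over "<type> <length>\0" followed by the object data.
    static byte[] getObjectId(String type, byte[] input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            String header = type + " " + input.length + "\0";
            md.update(header.getBytes(StandardCharsets.US_ASCII));
            md.update(input);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is not expected.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Lowercase hex, two characters per byte, matching Git's textual object IDs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Empty blob: prints e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
        System.out.println(toHex(getObjectId("blob", new byte[0])));
    }
}
```

Masking with 0xff before formatting keeps negative byte values from being sign-extended into eight hex digits.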

getObjectId("blob", "helloworld".getBytes()) returns 620ffd0fd9579a46e46ef4505b198ee0a01a57f2 (shown as hex). This is the same value that the git hash-object command returns.
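
Returning to the packfile checksum in the question: as the pack format documentation describes it, the 20-byte trailer is the SHA-1 of every byte that precedes it, starting at the "PACK" signature. Unlike an object ID, no "<type> <length>\0" header is prepended. A minimal sketch of that check, assuming the complete packfile is already in a byte array; the class and method names are hypothetical, and the demo uses an empty version-2 pack (12-byte header, zero objects) built in memory:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper class; not part of Git or JGit.
class PackTrailerCheck {
    static final int TRAILER_LEN = 20; // SHA-1 digest size in bytes
    static final int HEADER_LEN = 12;  // "PACK" + 4-byte version + 4-byte object count

    static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // True if the last 20 bytes equal the SHA-1 of everything before them.
    static boolean trailerMatches(byte[] pack) {
        if (pack.length < HEADER_LEN + TRAILER_LEN) {
            throw new IllegalArgumentException("too short to be a packfile");
        }
        int bodyLen = pack.length - TRAILER_LEN;
        MessageDigest md = sha1();
        md.update(pack, 0, bodyLen);
        byte[] computed = md.digest();
        byte[] stored = Arrays.copyOfRange(pack, bodyLen, pack.length);
        return MessageDigest.isEqual(computed, stored);
    }

    // Builds an empty version-2 pack: header with object count 0, then the trailer.
    static byte[] emptyPack() {
        byte[] header = {'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 0};
        byte[] trailer = sha1().digest(header);
        byte[] pack = Arrays.copyOf(header, header.length + trailer.length);
        System.arraycopy(trailer, 0, pack, header.length, trailer.length);
        return pack;
    }

    public static void main(String[] args) {
        System.out.println(trailerMatches(emptyPack())); // prints true
    }
}
```

Comparing the raw 20 bytes with MessageDigest.isEqual, rather than comparing printed hex strings, rules out any mistake introduced while formatting or copying the digest.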
