Github Repo Corruption - Sha1 Collision

Yesterday one of my team's check-ins corrupted our GitHub repo. GitHub reported this error:

$ git fsck
error: sha1 mismatch 87859f196ec9266badac7b2b03e3397e398cdb18

error: 87859f196ec9266badac7b2b03e3397e398cdb18: object corrupt or missing
missing blob 87859f196ec9266badac7b2b03e3397e398cdb18

When I tried to pull onto a different machine, I got this:

Hyperion:Convoy-clone saalon$ git fsck
warning in tree 5b7ff7b4ac7039c56e04fc91d0bf1ce5f6b80a67: contains zero-padded file modes
warning in tree 5db54a0cdcd5775c09365c19c061aff729579209: contains zero-padded file modes
broken link from    tree 6697c12387f8909cfe7250e9d5854fd6713d25c1
              to    blob 87859f196ec9266badac7b2b03e3397e398cdb18
dangling tree 144becf61ae14cec34b6af1bd8a0cf4f00d346d1
missing blob 87859f196ec9266badac7b2b03e3397e398cdb18

(I get the zero-padded file warnings on both the offending machine and the second machine I pulled to. I get the broken link error only on the second machine).

I tracked down the blob to the specific file that's the problem, but after going through the Git FAQ's process on fixing a broken link error, I had no luck.

I went through Github's documentation and found a process to delete the master repo from github and repush from the offending machine. I tried this, but when I went to re-push the master branch, I got the following error:

fatal: SHA1 COLLISION FOUND WITH 87859f196ec9266badac7b2b03e3397e398cdb18 !
error: unpack failed: index-pack abnormal exit

I've got an open ticket with Github but it's taking them forever to respond. Any idea what the problem might be? Is there a problem at Github that they need to fix, or is there something I can do to take care of this?


After some back and forth with GitHub (and some troubleshooting help from ssmir), this problem turned out to be split between a thing I needed to solve and a thing GitHub needed to solve.

What needed to be solved on my end was this:

Hyperion:Convoy-clone saalon$ git fsck
warning in tree 5b7ff7b4ac7039c56e04fc91d0bf1ce5f6b80a67: contains zero-padded file modes
warning in tree 5db54a0cdcd5775c09365c19c061aff729579209: contains zero-padded file modes
broken link from    tree 6697c12387f8909cfe7250e9d5854fd6713d25c1
              to    blob 87859f196ec9266badac7b2b03e3397e398cdb18
dangling tree 144becf61ae14cec34b6af1bd8a0cf4f00d346d1
missing blob 87859f196ec9266badac7b2b03e3397e398cdb18

If you notice, there's a broken link from a tree to a blob. What this is saying is that there's a folder that should have a file in it, but there's not actually a file in it. Someone added a file to their local repo and pushed it, but the file itself didn't end up in the remote repo. Now every time someone pulls down the repo themselves, they get the same broken git filesystem link.

The instructions here do a good job of explaining what to do if you get the problem, but in the midst of the actual crisis, I found the description a little lacking in context. It gave a clear list of steps but not a great idea of the why - at least, not for someone who's still a little new to Git.

Basically, what you need to do is figure out what file that missing blob is, track down what computer checked it in last and go to work on their local repo. Their computer has both the SHA1 link to the file and the contents of the file itself. Everyone else has a pile of broken.

So first, we need to find out what blobs/files are in that tree. To do that, you use git ls-tree.

git ls-tree 6697c12387f8909cfe7250e9d5854fd6713d25c1

In my case, that listed only one file: the file that was corrupt. In your case, it might give a whole list of files, in which case what you need to do is match up the blob/file's SHA1 hash to the one mentioned in the broken link error. In my case, it was this:

100644 blob 87859f196ec9266badac7b2b03e3397e398cdb18    short_description.html

Notice that it doesn't give you the directory the file is actually supposed to be in. That's kind of frustrating, but with a little detective work you can find it. The file might be uniquely named, in which case you can just do a find for the file name. Or you can look through your commit history and see when and where a file called short_description.html was placed.
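If the detective work gets tedious, a recursive listing can do it for you. Here's a minimal sketch, assuming a throwaway repo with a made-up directory layout and file content (none of it is from my real repo). As far as I know, git ls-tree only reads tree objects, so it should still work when the blob itself is missing, but treat that as an assumption and try it on a copy first.

```shell
#!/bin/sh
# Sketch only: builds a throwaway repo so the lookup can be tried safely.
# The directory layout and file content here are made up for illustration.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir -p db/content
printf '<p>hello</p>\n' > db/content/short_description.html
git add db/content/short_description.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add description'

# In a real repo, this would be the hash from the "missing blob" line.
blob=$(git rev-parse HEAD:db/content/short_description.html)

# -r recurses into subtrees, so the last column is the path from the repo root.
# Note: printing $4 only works for paths without whitespace.
found_path=$(git ls-tree -r HEAD | awk -v h="$blob" '$3 == h { print $4 }')
echo "$found_path"
```

In a real repo you'd skip the setup and run only the ls-tree line, with HEAD swapped for whichever commit or tree references the blob.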

Here's the part the Git FAQ wasn't entirely clear on. They say to recreate the file, then run this command:

git hash-object -w db/content/page_parts/venues/86/short_description.html 

But what is that doing?

Basically, when you run git hash-object, it returns the SHA1 hash for that file. And (here's the important part) with the -w flag it also creates a blob from the file, and a blob was just what we were missing. Here's the part it's not clear on, though: for this to work, the file needs to match exactly the file that initially caused the problem. In other words, if that short_description.html file had content in it, you can't just create a blank file and run hash-object. If you do, the blob's SHA1 hash won't match the one git is missing, and that broken link will still be broken.
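To see why the content has to match exactly, here's a small sketch in a scratch repo. The file names and content are made up for illustration. It hashes a file, an identical copy, and a blank file, without writing anything to the object store.

```shell
#!/bin/sh
# Sketch only: scratch repo with hypothetical file names and content.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .

printf 'original content\n' > short_description.html
: > blank.html

# Without -w, hash-object only prints the hash; nothing is stored.
wanted=$(git hash-object short_description.html)
blank=$(git hash-object blank.html)

# An identical copy hashes to the same ID; the file name plays no part.
cp short_description.html copy.html
copy=$(git hash-object copy.html)

echo "original: $wanted"
echo "copy:     $copy"
echo "blank:    $blank"
```

The copy gets the same ID as the original, while the blank file gets a different one, which is why a blank stand-in can't fill the hole.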

This is why you need to be on the offending machine's repo. Everyone else has a link but no file and no blob. The offending machine (hopefully) still has the original file. In my case, they didn't have the original file (in my flailing, it had been deleted inadvertently), but when I looked at their commit history on their box, the diff contained the content of the file that had been committed but never made it to github. I copied that out, recreated the file and ran hash-object. The next time I ran git fsck, the broken link was gone.

One note: technically, this problem can be fixed on someone else's repo, provided you can recreate the missing file. In my case, I actually had the file created on the offending machine, but had it e-mailed to me and fixed the problem in a clean repo on a different system. The important thing is recreating the file exactly so it generates the same sha1 hash that the repo is missing.
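The whole break-and-repair cycle can be rehearsed in a scratch repo before touching the real one. This is a sketch under assumptions: the damage is simulated by deleting a loose object file, which is not necessarily how my repo got broken, and the file name and content are made up.

```shell
#!/bin/sh
# Sketch only: simulates a missing blob in a throwaway repo, then repairs it.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
printf 'venue text\n' > short_description.html
git add short_description.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add description'

blob=$(git rev-parse HEAD:short_description.html)

# Simulate the damage: delete the loose object file that backs the blob.
dir=$(printf '%s' "$blob" | cut -c1-2)
file=$(printf '%s' "$blob" | cut -c3-)
rm -f ".git/objects/$dir/$file"

# fsck exits non-zero while the blob is missing.
if git fsck >/dev/null 2>&1; then before=clean; else before=broken; fi

# Repair: write the blob back from a file with exactly the same content.
restored=$(git hash-object -w short_description.html)

if git fsck >/dev/null 2>&1; then after=clean; else after=broken; fi
echo "before: $before, after: $after"
```

The repair only works because the working copy of the file still has byte-for-byte the same content as the blob that went missing.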

As for the SHA1 collision problem I got when I tried to push to github? This ugly sucker?

fatal: SHA1 COLLISION FOUND WITH 87859f196ec9266badac7b2b03e3397e398cdb18 !
error: unpack failed: index-pack abnormal exit

That was a problem on GitHub's side that they needed to fix.


Just a reminder. A small likelihood of something happening is not the same as it not being able to happen. You can get hash collisions with git's use of sha-1. Once you have two files that collide, the likelihood becomes 100%. At that point, there's slim consolation from the theoretical likelihood. Add a space to one and you'll be fine though.
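For the curious, here's a rough sketch of how git derives a blob's ID, and why adding a space gives you a different one. It assumes GNU coreutils' sha1sum is available, and the helper name git_blob_id is my own invention. It doesn't produce an actual collision; it only shows that a one-byte change yields a different ID.

```shell
#!/bin/sh
# Sketch only: recomputes blob IDs by hand, without calling git.
set -e

# Git's ID for a blob is the SHA-1 of "blob <size>\0" followed by the bytes.
# Assumes GNU coreutils' sha1sum; on macOS, `shasum` is the usual substitute.
git_blob_id() {
  size=$(wc -c < "$1" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

work=$(mktemp -d)
: > "$work/empty.txt"
printf 'same text' > "$work/a.txt"
printf 'same text ' > "$work/b.txt"   # one trailing space added

empty_id=$(git_blob_id "$work/empty.txt")
a_id=$(git_blob_id "$work/a.txt")
b_id=$(git_blob_id "$work/b.txt")
echo "$empty_id"
echo "$a_id"
echo "$b_id"
```

The empty file's ID should match what git hash-object prints for an empty file, which is a quick way to check the helper against git itself.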


I ran into the same issue and ran:

git prune  
git gc  

which mentioned

error: bad ref for refs/remotes/origin/ticketName

so I removed the reference and that fixed the issue:

rm .git/refs/remotes/origin/ticketName


This happened to me recently on "git pull" from an AWS Git server. The commands below fixed the issue:

git prune
git gc
