Why did --cached option on filter-branch remove files from working directory?
I needed to remove some Xcode files that should have been ignored from an old repo, so I ran the following command:
git filter-branch --index-filter 'git rm -f --cached --ignore-unmatch *mode1v3 *pbxuser' HEAD
My understanding was that adding --cached would not affect the current working directory, but git deleted the matching files there too. Luckily I had a backup(!), but I'm curious why it does this. Or am I misunderstanding what --cached does?
The culprit is not the git rm command. Its --cached option does indeed work as you say; you can easily try that in a small git repo.
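To try this out, here is a minimal sketch that builds a throwaway repository and runs git rm --cached on a tracked file. The file name demo.pbxuser and the identity settings are made up for illustration; nothing here touches your real repo.

```shell
#!/bin/sh
# Throwaway repository; every name below is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Commit a file that should have been ignored.
echo "settings" > demo.pbxuser
git add demo.pbxuser
git commit -q -m "Add a file that should have been ignored"

# Remove the file from the index only.
git rm -q --cached demo.pbxuser

# The file is still on disk...
ls demo.pbxuser
# ...while git now reports it as staged for deletion and untracked.
git status --short
```

The file stays in the working directory, which suggests the deletion you saw came from filter-branch itself rather than from git rm.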
Although the man page does not mention it, git filter-branch does not seem to preserve your working area. In fact, the command refuses to run if your working area is not clean, which is already an indication.
But even though the files are gone from the working area, they are not gone from the repo. They are just no longer in any commit reachable from your current branch. filter-branch stores a reference to your branch as it was before rewriting in the reference namespace refs/original/.
Use the command git show-ref to see it.
You could check out the old version to access your removed files. Alternatively, you could use the command
git cat-file blob refs/original/refs/heads/master:foo
to get the contents of the file without checking it out (use the reference shown by show-ref; foo is the name of the desired file). There are plenty of possibilities.
You can use gitk --all to navigate through both the original and the rewritten branches, and you will see that nothing is really gone.
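The recovery route described above can be sketched end to end on a scratch repository. This is an illustration under assumptions, not a recipe for your repo: the file names are hypothetical, the branch name is read from HEAD rather than assumed to be master, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation notice that newer Git versions print.

```shell
#!/bin/sh
set -e
# Skip the warning and delay that newer Git versions add to filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "settings" > demo.pbxuser
echo "code" > main.c
git add demo.pbxuser main.c
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

# Rewrite history so that *.pbxuser files are dropped from every commit.
git filter-branch --index-filter \
  'git rm -q --cached --ignore-unmatch "*.pbxuser"' HEAD >/dev/null 2>&1

# List the backup reference that filter-branch left behind.
git show-ref | grep refs/original/

# Write the old contents back into the working directory.
git cat-file blob "refs/original/refs/heads/$branch:demo.pbxuser" > demo.pbxuser
cat demo.pbxuser
```

The restored file is untracked afterwards, so this is also a reasonable moment to add the pattern to .gitignore.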
The behaviour of git-filter-branch can be surprising, as you've discovered, and it won't protect you from unintended consequences when you run it.
Instead I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative specifically designed for deleting files from Git history. One way in which it makes your life easier here is that it will not delete, or change in any way, files in your latest commit.
You should follow the usage instructions - but the core bit is just this: download the BFG's jar (requires Java 6 or above) and run this command:
$ java -jar bfg.jar --delete-files '*{mode1v3,pbxuser}' my-repo.git
Any file in your repository history that matches that expression, and which isn't also in your latest commit, will be deleted. You can then use git gc to clean away the dead data:
$ git gc --prune=now --aggressive
The BFG is generally much simpler to use than git-filter-branch; the options are tailored around these two common use-cases:
- Removing Crazy Big Files
- Removing Passwords, Credentials & other Private data
Full disclosure: I'm the author of the BFG Repo-Cleaner.