Getting git branches of a certain age
My organization uses git branching extensively. As a result, we have produced over 2000 branches in the past year. We are now trying to adopt a strategy for cleaning up all the old branches that are of some given age. I know how to delete branches, but I can't find a straightforward way to list all of the branches with heads of a given age. The plan is to set up a cron job that periodically deletes all branches of a given age, except those that are on some list.
Has anyone tried anything like this before?
The answers using committer dates are a good direction... if you want to delete branches that point to old commits. But you might want to delete branches which are actually old; if you create a branch today pointing to a commit from last year, you don't want it wiped!
So, you want to examine the reflog dates instead.
You can get a human-readable form with git reflog show --date=local mybranch:
8b733bc mybranch@{Tue Mar 22 13:21:49 2011}: commit: foo
7e36e81 mybranch@{Tue Mar 22 13:21:25 2011}: commit: bar
99803da mybranch@{Tue Mar 22 13:20:45 2011}: branch: Created from otherbranch
(You might also like --date=relative.)
The entry on the top is the most recent thing that happened on that branch, so that's the bit we care about. Unfortunately, there's no log format placeholder for just the date, so to grab out just the date, we do a little work:
git log -g -n 1 --date=local --pretty=%gd mybranch | sed 's/.*{\(.*\)}/\1/'
# Prints "Mon Mar 21 13:23:26 2011"
Of course, for scripting, that's not very useful, so let's go ahead and get the epoch time instead:
git log -g -n 1 --date=raw --pretty=%gd mybranch | sed 's/.*{\(.*\) .*/\1/'
# Prints 1300731806
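The same extraction can be done without sed. This is a minimal sketch, assuming the raw selector always has the form name@{epoch timezone}; the function names selector_epoch and branch_reflog_epoch are made up for illustration.

```shell
#!/bin/bash
# Extract the epoch seconds from a raw reflog selector such as
# "mybranch@{1300731806 -0400}" using only parameter expansion.
selector_epoch() {
    local inside="${1##*@\{}"      # drop everything up to and including the last "@{"
    inside="${inside%\}}"          # drop the trailing "}"
    printf '%s\n' "${inside%% *}"  # keep the part before the first space
}

# Hypothetical wrapper: epoch time of the newest reflog entry of a branch.
# Returns non-zero if the branch has no reflog entries.
branch_reflog_epoch() {
    local selector
    selector=$(git log -g -n 1 --date=raw --pretty=%gd "$1" -- 2>/dev/null) || return 1
    [ -n "$selector" ] || return 1
    selector_epoch "$selector"
}
```

Taking the last "@{" means branch names that contain slashes are handled the same way as simple ones.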
Now we're getting somewhere!
#!/bin/bash
# Branches whose newest reflog entry is older than this get deleted.
cutoff_date=$(date --date="July 23, 2010" +%s)
git for-each-ref refs/heads --format='%(refname)' | while read -r branch; do
    # Epoch time of the most recent reflog entry (empty if the reflog has expired)
    reflog_date=$(git log -g -n 1 --date=raw --pretty=%gd "$branch" -- | sed 's/.*{\(.*\) .*/\1/')
    if [ -n "$reflog_date" ] && [ "$reflog_date" -lt "$cutoff_date" ]; then
        git branch -D "${branch#refs/heads/}"
    fi
done
An example script! I used date to convert a human-readable date into the cutoff, then for each branch I checked whether the newest reflog entry was before the cutoff, and if so, deleted the branch. You could add a check against a whitelist there, to save yourself from accidentally deleting something you care about. (Edit: this won't delete branches whose last change is more than 90 days old, because by default their reflog entries will already have expired and the reflog will be empty. It's up to you what to do in that case. You could fall back to checking the committer date, which ought to be pretty safe at that point.)
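The whitelist check and the committer-date fallback could look roughly like this. It is only a sketch: the list contents and the helper names is_protected and is_stale are assumptions, not part of the script above.

```shell
#!/bin/bash
# Hypothetical keep-list (short branch names, space separated); use your own.
protected_branches="master develop"

# Succeeds if the given short branch name is on the keep-list.
is_protected() {
    local name
    for name in $protected_branches; do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

# Succeeds if the branch counts as stale.
#   $1 = epoch time of the newest reflog entry (empty if the reflog has expired)
#   $2 = committer time of the branch tip, in epoch seconds
#   $3 = cutoff, in epoch seconds
is_stale() {
    local when="${1:-$2}"   # fall back to the committer date when there is no reflog
    [ -n "$when" ] && [ "$when" -lt "$3" ]
}
```

Inside the loop, the committer time can be read with git log -1 --format=%ct "$branch", and the deletion would then be guarded by something like: is_protected "${branch#refs/heads/}" || is_stale "$reflog_date" "$commit_date" "$cutoff_date".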
Edit: Here's another approach. Expire the reflogs at the cutoff time, then delete the branches whose reflogs are empty. The problem here is that if the cutoff time is older than the time when your reflogs already expire (90 days) it'll really just be deleting branches older than 90 days instead. You could work around that, of course.
#!/bin/bash
# Git lets you use very readable time formats!
cutoff_time="1 year ago"
# other examples:
# cutoff_time="July 23, 2010"
# cutoff_time="yesterday"
git for-each-ref refs/heads --format='%(refname)' | grep -Ev 'master|other-whitelisted-branch' |
while read -r branch; do
    # Drop reflog entries older than the cutoff; a branch left with no entries is stale
    git reflog expire --expire="$cutoff_time" "$branch"
    if [ "$(git reflog show -1 "$branch" | wc -l)" -eq 0 ]; then
        git branch -D "${branch#refs/heads/}"
    fi
done
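One caveat with the pattern filter in that script: it matches substrings, so a branch such as master-old would be protected too. A possible alternative, sketched here with a hypothetical filter_protected helper and a keep-file of full ref names, is to match whole lines as fixed strings:

```shell
#!/bin/bash
# Drop protected refs from a list of full ref names read on stdin.
#   $1 = file with one protected ref per line, e.g. refs/heads/master
# -x matches whole lines only, -F treats the patterns as fixed strings,
# so "refs/heads/master-old" is not protected just because "master" is.
filter_protected() {
    grep -vxF -f "$1"
}
```

It would slot into the pipeline in place of the pattern filter, for example: git for-each-ref refs/heads --format='%(refname)' | filter_protected keep-list.txt (the file name is only an example).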
Update: as Jefromi and cebewee point out below, this solution looks at the 'committer date' of the commit at each branch tip, and in some situations this wouldn't be good enough - to use the former's example, if you care about branches which were recently created based on much older branches, you'd need to use the reflog as in Jefromi's answer. I think that for plenty of situations this is good enough, though, so I'm leaving the answer rather than deleting it...
I did a blog post on this recently, with a script that lists branches in increasing order of the date of the last commit on that branch, which I've found useful for a very similar situation to yours. The script is based around git for-each-ref --sort=committerdate:
#!/bin/sh
for C in $(git for-each-ref --sort=committerdate refs/heads --format='%(refname)')
do
    git show -s --format="%ci $C" "$C"
done
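If the goal is to pick out the branches older than a cutoff rather than just list them, for-each-ref can print the committer date itself and a small filter can do the comparison. A minimal sketch, assuming a Git version recent enough to support %(committerdate:unix) and branch names without spaces; older_than is a made-up helper name.

```shell
#!/bin/bash
# Read "epoch name" lines on stdin and print the names whose epoch
# is strictly before the cutoff given as $1 (epoch seconds).
older_than() {
    awk -v cutoff="$1" '$1 < cutoff { print $2 }'
}
```

A possible invocation, using GNU date for the cutoff: git for-each-ref --sort=committerdate --format='%(committerdate:unix) %(refname:short)' refs/heads | older_than "$(date --date='1 year ago' +%s)"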
You will need to script it and then use this to grab the date:
git log -1 branchName --format=%ci
This should give you a date that you can order by.
Now you just need to iterate over the branches:
for branch in $(git for-each-ref --format='%(refname:short)' refs/remotes); do yourscript "$branch"; done
Hope this helps.
I've ended up with this solution:
for k in $(git branch -r | awk -F/ '/\/Your_prefix_here/{print $2}' | sed '/\*/d'); do
    # No commit since the cutoff date means the branch is stale
    if [ -z "$(git log -1 --since='Jul 31, 2015' -s "origin/$k")" ]; then
        echo deleting "$(git log -1 --pretty=format:"%ct" "origin/$k") origin/$k";
    fi;
done
It also filters branches by a given pattern, since we use a try-XX naming convention for branches.
To find out when a branch last changed (as opposed to the date of the last commit on the branch), you need to use the reflog. git reflog show -1 --date=relative branchName displays the newest reflog entry for a branch, along with the date of the last change to the branch.
There are various different date formats, choose one which is easy to parse for your use case.
A problem with this solution is that reflog entries expire, by default after 90 days. So if the last change on a branch is older than 90 days (and you ran a gc), you do not get any information about this branch. You can work around this by changing the expiry time for the reflog; see gc.reflogExpire in git-config.
A very simple solution, which does not always work:
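Just look at .git/refs/heads/ and sort by modification date.
Note that this won't work if the repository itself is new (for example, a fresh clone), since git simply doesn't track such information and every file will carry the time it was written locally. Branches that git has packed into .git/packed-refs have no file there at all. But it might work very well in the future, when you have a central repository which otherwise doesn't change much.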
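A rough sketch of that listing, assuming GNU find (for -printf) and loose ref files on disk; refs_by_mtime is a hypothetical helper name. It walks subdirectories too, so branch names containing slashes are included.

```shell
#!/bin/bash
# List branch names under a refs/heads directory, oldest modification first.
#   $1 = directory to scan (defaults to .git/refs/heads)
refs_by_mtime() {
    local heads_dir="${1:-.git/refs/heads}"
    # %T@ = modification time in epoch seconds, %P = path relative to heads_dir
    find "$heads_dir" -type f -printf '%T@ %P\n' | sort -n | cut -d' ' -f2-
}
```

The same caveats apply: the times are file modification times, not anything git records about the branch.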