
How do you manage large git repositories?

One of our git repositories is large enough that a git-clone takes an annoying amount of time (more than a few minutes). The .git directory is ~800M. Cloning always happens over SSH on a 100 Mbps LAN. Even cloning over SSH to localhost takes more than a few minutes.

Yes, we store data and binary blobs in the repository.

Short of moving those out, is there another way of making it faster?

Even if moving large files out were an option, how could we do it without the major interruption of rewriting everyone's history?


I faced the same situation with a ~1GB repository, needing to be transferred over DSL. I went with the oft-forgotten sneakernet: putting it on a flash drive and driving it across town in my car. That isn't practical in every situation, but you really only have to do it for the initial clone. After that, the transfers are fairly reasonable.
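For example, a minimal sketch of doing the initial clone from a flash drive and then pointing the remote back at the real server (the mount point and hostname here are made up):

    # Clone from the repository copied onto the flash drive (hypothetical mount point)
    git clone /media/usb/bigrepo.git bigrepo
    cd bigrepo
    # Point origin back at the real server so later fetches/pushes go over the network
    git remote set-url origin ssh://git@server.example.com/srv/git/bigrepo.git
    git fetch origin

After that, only the new objects travel over the wire, which is what makes the subsequent transfers reasonable.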


I'm fairly sure you're not going to be able to move those binary files out without rewriting history.

Depending on what the binaries are (maybe some pre-built libraries or whatever), you could have a little script for the developer to run post-checkout which downloads them.
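A rough sketch of such a post-checkout hook, assuming the binaries are published at some internal URL and listed in a file like binaries.txt (both names are hypothetical, adjust for your environment):

    #!/bin/sh
    # .git/hooks/post-checkout -- fetch pre-built binaries that are kept out of the repo
    # (hypothetical artifact server and file list)
    BASE_URL="http://build-server.example.com/artifacts"
    while read -r file; do
        # Only download files we don't already have locally
        [ -f "$file" ] || curl -fsS -o "$file" "$BASE_URL/$file"
    done < binaries.txt

The hook just needs to be copied into .git/hooks/ and made executable on each developer's machine, since hooks are not cloned with the repository.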


Gigabit... fiber... Without rewriting history, you are fairly limited.

You can try a git gc; it may clean things up a bit, but I'm not sure whether that is already done as part of a clone anyway.
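If you want to try it, something along these lines repacks the repository and shows whether the pack actually shrank (the --aggressive pass can take a while on a repo this size):

    # Check object/pack sizes before
    git count-objects -vH
    # Repack more aggressively and drop unreachable objects
    git gc --aggressive --prune=now
    # Compare sizes after
    git count-objects -vH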


Even if moving large files out were an option, how could we do it without the major interruption of rewriting everyone's history?

Check this answer: Will git-rm --cached delete another user's working tree files when they pull

This measure, together with adding patterns to .gitignore, should help you keep those big files out.
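Roughly, assuming the blobs live under something like a data/ directory (a hypothetical path), the steps would look like this; note that this only stops tracking the files going forward, the old blobs remain in history:

    # Stop tracking the big files without deleting them from anyone's working tree
    git rm -r --cached data/
    # Ignore them from now on
    echo "data/" >> .gitignore
    git add .gitignore
    git commit -m "Stop tracking binary data directory"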
