
Git and binary data

I'm currently starting to use Git as my version control system, but I do a fair bit of web/game development, which of course requires images (binary data) to be stored. So, if my understanding is correct: if I commit an image and it changes 100 times, then when I fetch a fresh copy of that repo I'd basically be downloading all 100 revisions of that binary file?

Isn't this an issue with large repos where images change regularly? Wouldn't the initial fetch of the repo end up becoming quite large? Has anybody experienced any issues with this in the real world? I've seen a few alternatives, for instance using submodules and keeping the images in a separate repo, but that only keeps the codebase smaller; the image repo would still be huge. Basically, I'm just wondering if there's a nice solution to this.


I wouldn't call that "checkout", but yes: the first time you fetch the repository, provided the binary data is huge and incompressible, the download is going to be what it is - huge. And yes, since the law of conservation is still in effect, breaking it into submodules won't save you space or time on the initial pull of the repository.

One possible solution is still to use a separate repository together with the --depth option when cloning it. Shallow repositories have some limitations, but I don't remember exactly what, since I've never used them. Check the docs; the keyword is "shallow".

Edit: From git-clone(1):

A shallow repository has a number of limitations (you cannot clone or fetch from it, nor push from nor into it), but is adequate if you are only interested in the recent history of a large project with a long history, and would want to send in fixes as patches.
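
As a minimal sketch (the repository URL and depth are placeholders), a shallow clone of a separate assets-only repository would look like this:

    # Fetch only the most recent commit of a hypothetical assets repo,
    # skipping the full history of every binary file
    git clone --depth 1 https://example.com/myproject-assets.git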


What I do is make the image directories ignored/untracked, and then sync them using other, non-git tools (or just manually copy the image directory changes once, when you're talking about a lot of images that you don't need to keep completely synced).
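
A sketch of that setup, assuming the images live under an images/ directory and rsync is the non-git sync tool (both are placeholders):

    # Keep the image directory out of version control
    echo "images/" >> .gitignore

    # Sync the ignored directory with a non-git tool such as rsync
    # (the server path is hypothetical)
    rsync -av images/ backup-server:/srv/myproject/images/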


Unfortunately, Git is not really made for storing binary data. Because it is distributed, you pull all versions of all files whenever you clone a repository. It also becomes ridiculously difficult to prune those large binary files out of your code repository afterwards. More about that here: http://www.somethingorothersoft.com/2009/09/08/the-definitive-step-by-step-guide-on-how-to-delete-a-directory-permanently-from-git-on-widnows-for-dumbasses-like-myself/
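
To give a sense of how involved that pruning is, it requires a full history rewrite along these lines (a sketch; the images/ path is a placeholder, and since this rewrites every commit it should be treated with care):

    # Rewrite all branches, removing the images/ directory from every commit
    git filter-branch --index-filter \
        'git rm -r --cached --ignore-unmatch images/' \
        --prune-empty -- --all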

I would recommend trialling it, but keep the binary files separate from the code (e.g. using submodules), as sketched below. That way, if it doesn't work out for you, you can switch to another solution without rewriting the whole history of your main repository.
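
A sketch of that split, assuming a hypothetical separate assets repository:

    # Track the image repo as a submodule at assets/ in the main repo
    git submodule add https://example.com/myproject-assets.git assets
    git commit -m "Add image assets as a submodule"

    # Collaborators then fetch it alongside the main repo
    git clone --recurse-submodules https://example.com/myproject.git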


There is a discussion of large-file storage with Git here: http://blog.deveo.com/storing-large-binary-files-in-git-repositories/

I came across this SO question as part of my research, and I thought I would point folks to the blog entry I've already reviewed (spoiler alert: they recommend git-annex for non-Windows users).
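
For flavour, basic git-annex usage looks roughly like this (the file name is a placeholder; git-annex keeps the large content outside normal Git history and checks in a pointer symlink instead):

    # One-time setup in an existing repository
    git annex init

    # Add a large binary through the annex rather than plain git add
    git annex add images/big-texture.png
    git commit -m "Add texture via git-annex"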
