git branch vs. git repository, which one should I use?
Which one takes more disk space? How do you track bug fixes (we are using Jira)? How do you know which branch or repository has the bug fix? There must be other advantages/disadvantages to repository vs. branch?
Your question doesn't make a whole lot of sense. Repositories always contain at least one branch, and you can't have a branch without a repository. So of course, a branch is smaller than a repository. The two aren't really comparable things.
So, let's talk for a moment about what a Git repository actually contains. There are a few main things:
work tree - this is the current state of all your files, the ones you're working on. This can take up a decent amount of space.
objects - these are the internal way git stores things - blobs to represent the contents of files, trees to represent directory structure, commits to represent snapshots of the work tree. This is the other thing that really takes space.
refs - short for references: branches or tags. These are very, very lightweight - they're just pointers to particular commits. They do take up a little space, but it's so little that you should think of them as completely free. Tags are fixed references; they point to one commit forever, and are used for things like marking versions. Branches are movable references; with one checked out, it advances forward as you commit. They're what you'll use to represent lines of development - a stable branch, a branch for a particular bugfix or feature, you name it.
There are other things inside the .git directory besides objects and refs, but don't concern yourself much with them now.
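To make the "refs are just pointers" point concrete, here is a minimal sketch in a throwaway repository. The names bugfix-demo and v0.1-demo are made up for illustration, and the identity variables are only there so the commits work without any global git config.

```shell
set -e
# Throwaway identity so commits work without global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "hello" > file.txt
git add file.txt
git commit -q -m "Initial commit"
head_commit=$(git rev-parse HEAD)

# Creating a branch and a lightweight tag adds two refs, but no new objects.
objects_before=$(git count-objects -v | grep '^count:')
git branch bugfix-demo
git tag v0.1-demo
objects_after=$(git count-objects -v | grep '^count:')
branch_commit=$(git rev-parse bugfix-demo)

# Commit with the branch checked out: the branch advances, the tag stays put.
git checkout -q bugfix-demo
echo "fix" >> file.txt
git commit -q -am "Fix on branch"
moved_branch=$(git rev-parse bugfix-demo)
fixed_tag=$(git rev-parse v0.1-demo)

echo "objects before/after creating refs: $objects_before / $objects_after"
echo "tag still at:  $fixed_tag"
echo "branch now at: $moved_branch"
```

Both refs start out naming the same commit; only the branch moves once you commit on it.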
How do you track bug fixes (we are using Jira)? How do you know which branch or repository has the bug fix?
This is kind of up to you. Generally people end up embedding some information in their commit messages to indicate that they address some particular bug/issue. You should probably have a defined workflow, where you make your bugfixes on their own branches (or perhaps a maintenance branch on top of a previous release), and merge them into your current development branch, thereby sharing the bugfixes with the future version. You should be able to say something like "the master and maintenance branches always have all the bug fixes" - though if you want to check, you can do something like git log --grep='bug 1234', assuming you've put that string in your commit message!
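As a rough sketch of that workflow with a Jira-style key in the commit message: the key PROJ-1234 and the branch names maint and bugfix/PROJ-1234 are hypothetical, and the repository is a throwaway one. Once the fix commit is found with --grep, git branch --contains lists the branches that already include it.

```shell
set -e
# Throwaway identity so commits work without global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
echo base > app.txt
git add app.txt
git commit -q -m "Initial release"
git branch maint                          # maintenance branch at the release

# Make the fix on its own branch, with the issue key in the message.
git checkout -q -b bugfix/PROJ-1234 maint
echo fix >> app.txt
git commit -q -am "PROJ-1234: handle empty input"

# Merge the fix into maint, then carry maint forward into master.
git checkout -q maint
git merge -q --no-ff -m "Merge PROJ-1234 into maint" bugfix/PROJ-1234
git checkout -q master
git merge -q maint

# Find the fix by its key, then ask which branches contain that commit.
fix_commit=$(git log --all --no-merges --grep='PROJ-1234' --format=%H)
branches_with_fix=$(git branch --contains "$fix_commit" --format='%(refname:short)')
echo "fix commit: $fix_commit"
echo "branches containing it:"
echo "$branches_with_fix"
```

If a branch is missing from that list, the fix has not been merged there yet.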
In general, there's no need to have multiple clones of the same project's repository. You'll probably have a central one, and each developer will have their own - but when a developer publishes their work, they'll be putting it in the central one, and that's where everything important should be.
A branch always costs virtually nothing to create inside the same repo (technically speaking it depends on what you put in the branch, but I'm sure that was clear to begin with).
When cloning the repo, you could end up with twice the storage requirements (both for the packs and for the working tree). That might lead you to believe that cloning repositories is necessarily a suboptimal thing. Let me tell you why that isn't always true.
A repo is essentially just a collection of branch refs, plus the objects those refs reach. Deep-copying a single branch takes nearly as much space as copying the whole repo; the only thing you save is the objects that are reachable solely from other branches. (Usually not much difference there.)
Locally, you can clone a repo at virtually no extra cost because it can be hardlinked (within the same filesystem, that is). Also, the cost incurred by having another working tree can be avoided by cloning into a bare repo (e.g. by using git clone --mirror).
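A small sketch of the two kinds of clone, using throwaway directories (the names origin, work and backup.git are just for illustration). A plain local clone gets its own working tree, while a --mirror clone is bare and has none; whether objects actually end up hardlinked depends on both paths being on the same filesystem.

```shell
set -e
# Throwaway identity so commits work without global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
git init -q "$base/origin"
cd "$base/origin"
echo data > file.txt
git add file.txt
git commit -q -m "Initial commit"
cd "$base"

# Plain clone from a local path: has a working tree of its own.
git clone -q "$base/origin" "$base/work"

# Mirror clone: bare (no working tree) and copies all refs.
git clone -q --mirror "$base/origin" "$base/backup.git"

work_is_bare=$(git -C "$base/work" rev-parse --is-bare-repository)
backup_is_bare=$(git -C "$base/backup.git" rev-parse --is-bare-repository)
echo "work is bare:   $work_is_bare"
echo "backup is bare: $backup_is_bare"

# Compare on-disk sizes (numbers vary by system, so none are shown here).
du -sk "$base/work" "$base/backup.git"
```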
WITHIN A PROJECT
In my opinion, a repo corresponds to a 'player' in a distributed workflow. Players can be 'natural persons' or 'role proxies'. I have:
- central repo (with web connectivity so I can work from anywhere pushing and pulling)
- work repo (one per workstation; with the ephemeral and temp branches that are associated with ProofOfConcept-ing, merging, micro-committing etc.)
- a backup repo
- [ perhaps a github clone for pull-requests if the upstream devs are on github ]
In practice only 3-4 types of branches are shared across all repos (master, maint, testing, unstable).
OUTSIDE A PROJECT
Of course, a project will have its own repo (most of the time). I am starting to lean towards using submodules because they seem more flexible than heterogeneous branches (i.e. different projects in separate branches of the same repo).
One big boon of having a single repo is that it will be very smooth to switch its associated working tree from one branch to another. To be fair, this difference can be worked around in two directions:
- you can switch your working tree to a branch from another repository by adding it as a remote
- you can have a separate (even bare) repo work with a 'non-local' working tree by setting the GIT_DIR and GIT_WORK_TREE environment variables
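Both workarounds can be sketched in a few commands. Everything here lives in throwaway directories, and the names other, mine, store.git, tree and the branch experiment are invented for the example; the environment variables git reads for the second approach are GIT_DIR and GIT_WORK_TREE.

```shell
set -e
# Throwaway identity so commits work without global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)

# Another repository holding the branch we are interested in.
git init -q "$base/other"
cd "$base/other"
git symbolic-ref HEAD refs/heads/experiment
echo "from other repo" > note.txt
git add note.txt
git commit -q -m "Work in other repo"

# 1) Add the other repo as a remote and switch our working tree to its branch.
git init -q "$base/mine"
cd "$base/mine"
echo mine > own.txt
git add own.txt
git commit -q -m "My own work"
git remote add other "$base/other"
git fetch -q other
git checkout -q -b experiment other/experiment

# 2) Keep the repo bare and point it at a working tree that lives elsewhere.
git clone -q --bare "$base/other" "$base/store.git"
mkdir "$base/tree"
GIT_DIR="$base/store.git" GIT_WORK_TREE="$base/tree" git checkout -q -f experiment

ls "$base/mine" "$base/tree"
```

In the second case the directory tree contains only the checked-out files; all git metadata stays in store.git.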
You see, when you look at it this way, a repo starts to be a collection of branches, with a more or less conventional association to a single working tree. The conventions are where the real differences emerge: you usually use repos to manage sets of branches and different working trees.
To me, a repository should house a distinct project. A branch is a "pathway" of development on the project. So you might have a development branch, a production branch, and a few feature branches, and some bug fix branches. The branches can be merged with each other as needed and eventually merged into development and production.
Disk space: a whole new repository takes more space than a new branch, since a branch is effectively just a pointer when it is first created, while a second repository needs to store all the git files all over again.
Branches within a repo have advantages beyond the other comments as well. Many graphical (or text UI) tools can show you information about commits/tags/etc on branches within a repo. That data just wouldn't be available if you created a new repository each time you wanted to fracture development.
You will also have an easier time expressing branch-of-branch relationships when internal branching is used. It's just natural there, whereas with additional repos you have to track and manage the relationship externally.
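For instance, inside one repo git can tell you directly where a sub-branch forked from its parent branch. This is only a sketch with made-up branch names (development, feature, feature-part) in a throwaway repository.

```shell
set -e
# Throwaway identity so commits work without global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/development
echo base > app.txt
git add app.txt
git commit -q -m "Base"

# A feature branch off development...
git checkout -q -b feature
echo feature >> app.txt
git commit -q -am "Start feature"
fork_point=$(git rev-parse HEAD)

# ...and a branch of that branch.
git checkout -q -b feature-part
echo part >> app.txt
git commit -q -am "Sub-task of feature"

# Meanwhile the parent feature branch keeps moving.
git checkout -q feature
echo more >> app.txt
git commit -q -am "More feature work"

# Where did feature-part fork from feature?
common=$(git merge-base feature feature-part)
echo "fork point: $common"

# The same relationship, drawn as a graph across all branches.
git --no-pager log --graph --oneline --all
```

With separate repositories, you would first have to fetch one into the other before either command could answer the question.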