What is branched in a repository?
From what I understand of Subversion, if you have a repo that contains multiple projects, you can branch individual projects within that repo (see the SVN Red Book, "Using Branches").
However what I don't quite follow is what happens when you create a branch in one of the distributed systems (Git, Hg, Bazaar - I don't think it matters which one). Can you branch just a sub-directory of the repo, or when you create the branch are you branching the entire repo?
This question is part of a larger one that I posted on superuser (choice and setup of version control) and has come about as I am trying to figure out how best to version control a large hierarchical layout of independent projects.
It may be that, for distributed systems, what I would like to do is best handled by a sub-project mechanism of some sort, but again that is something I am not clear on, although I have heard the term mentioned in regard to git.
With Bazaar, if you create two branches in a shared repository, any common history they have lives within the repository and not the branch itself; the branch merely references it. This saves disk space for repositories that have many branches of the same project for different features, and it speeds up the creation of new branches (you don't have to duplicate the files containing branch history). It's been a while since I looked at hg and git, but I do not believe they have a feature identical to this.
Bazaar does not have sub-projects. A branch is a whole, contiguous unit. You cannot branch portions of it. I believe git and hg both have sub-project mechanisms, though.
Subversion being centralized, you can organize your projects within one repo as you want. Since branches are emulated as directories in SVN, you end up mixing:
- history isolation (which is the main purpose of a branch: you isolate the versions of a set of files from other versions from the same set of files)
- "component" isolation (a component or module being a group of files each in their own directory)
But with a DVCS, each repository is its own component (or module).
I.e. you don't want to put all your projects within one repo.
Rather you are using submodules (Git) or subrepos (Hg).
That leaves you with the branch as a pure history isolation:
When you branch, the history of the whole repo gains a new branch, ready to record (reference) any new commit you make.
There is no "cheap copy", just a new pointer.
Note: Mercurial has a more complex branching model which can involve cloning a repo to create a new branch, but the general principle behind branching stands.
With git, a branch is simply a pointer to the commit at the tip of the branch. It doesn't contain any information of its own. So, your history might look like this:
- o - o - o - o - o (branchA)
           \
            o - o (branchB)
Each o there is a commit, which represents the state of the entire repository at that point. The two branches thus in general represent different states of the entire repo, though it could be that they only differ in the contents of one subdirectory. There certainly won't be any wasted space, though; if two commits use the same version of a given file, they internally point to the same object for its contents.
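To make the "branch is just a pointer" idea concrete, here is a rough toy model in Python. It is not git's actual object format; the names `objects`, `branches`, `store` and `commit`, and the file paths, are all made up for illustration. It only sketches the two points above: creating a branch adds one name-to-commit entry, and a commit that changes one subdirectory still snapshots the whole repo while reusing the stored contents of unchanged files.

```python
import hashlib

objects = {}   # content-addressed store: id -> (kind, payload)
branches = {}  # branch name -> commit id; this mapping is the whole "branch"


def store(kind, payload):
    """Store payload under a hash of its content; identical content is stored once."""
    key = hashlib.sha1(f"{kind}:{payload!r}".encode()).hexdigest()
    objects[key] = (kind, payload)
    return key


def commit(files, parent=None):
    """Snapshot the entire tree (path -> text) and return the new commit id."""
    tree = tuple(sorted((path, store("blob", text)) for path, text in files.items()))
    return store("commit", (tree, parent))


# Hypothetical repo holding two projects in subdirectories.
base = commit({"projA/main.c": "int main(){}", "projB/lib.py": "x = 1"})
branches["branchA"] = base
branches["branchB"] = branches["branchA"]  # creating a branch: one new pointer

count_before = len(objects)
# A commit on branchB changes only projB, yet it still snapshots the whole tree.
branches["branchB"] = commit(
    {"projA/main.c": "int main(){}", "projB/lib.py": "x = 2"},
    parent=branches["branchB"],
)
new_objects = len(objects) - count_before  # one new blob plus one new commit
```

After the second commit, `branchA` still points at `base`, and only the changed file and the commit itself were added to the store; the unchanged `projA/main.c` contents are shared by both snapshots.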
Depending on what you're actually trying to do, you could be interested in using submodules, which are essentially a mechanism for placing repos inside of repos, so that you can have a meta-project repository which contains sub-project (embedded) repositories.
In general, distributed version control systems only let you create a new branch from the whole of an existing branch, rather than (as Subversion does) allowing you to make a copy of a small part of what you're working on. Git at least (and I think some of the others) allows you to reference sub-modules (which are git repositories in their own right).
Git does allow you to do pretty much anything you want, even if it's not particularly useful or obvious (and even if the tools won't really support you in doing it). There's no technical reason why all the branches in a Git repository need to have a common parent or have anything to do with each other at all. There's also nothing stopping you constructing a commit consisting of a sub-tree of its parent commit and Git's change tracking and merging will actually probably cope quite well in this case.
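As a rough illustration of branches that have nothing to do with each other, here is a toy commit graph in Python. The commit ids, branch names and helper functions are invented for the example and are not git's API; the only point is that a branch is a pointer into a graph of parent links, and nothing forces two branches to reach a common root.

```python
# Toy commit graph: commit id -> parent id (None marks a root commit).
parents = {
    "a1": None, "a2": "a1", "a3": "a2",  # history of one project
    "b1": None, "b2": "b1",              # unrelated history with its own root
}

# Hypothetical branch names, each pointing at a tip commit.
branches = {"projectA": "a3", "projectA-fix": "a2", "projectB": "b2"}


def ancestors(commit_id):
    """Return the commit plus every commit reachable through parent links."""
    seen = set()
    while commit_id is not None:
        seen.add(commit_id)
        commit_id = parents[commit_id]
    return seen


def share_history(branch_x, branch_y):
    """True if the two branches have at least one commit in common."""
    return bool(ancestors(branches[branch_x]) & ancestors(branches[branch_y]))
```

Here `share_history("projectA", "projectA-fix")` is true while `share_history("projectA", "projectB")` is false. In git itself, a branch with a fresh root like this can be started with `git checkout --orphan`.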
Mercurial at least differs from Git in this regard, as the Mercurial workflow seems tailored to trying to keep separate branches in separate repositories while the git workflow is quite happy with having many branches in the same repository.