Revision control system with multiple similar projects (one customized for each customer of the same product)
I have multiple software projects for different customers that I am putting under Git version control, and I would like to know the best way to do it. The projects are similar and most often derived from existing ones, but each is unique to its customer. Should I create a repo for each customer, or should I create new branches instead?
There is no way anyone here can give you the “best” way with such limited information.
As is the case for many other distributed version control systems, the decision on how you are going to publish your projects (many branches in a single repository, or multiple repositories with one/few branch(es) each) is largely independent of the more important issues of overall history management (how you will handle the history that Git records).
I would start by concentrating on determining which bits of data and history need to be shared across the projects. Looking forward, the history will entail not just what has happened in the past but what you expect to happen in the future.
It sounds like you might want to have a base set of files that is common across all your projects and then have some per-project files/changes layered on top of the base set of files. As a history diagram, it might look like this:
        a1--a2--a3  customer-a
       /
o--o--o  base
       \
        b1  customer-b
Going forward, it might be good enough to make changes on ‘base’ and just merge them up into the customer branches:
        a1--a2--a3--a4------a5  customer-a
       /           /       /
o--o--o--o--o--o--o--o----o  base
       \           \       \
        b1----------b2--b3--b4  customer-b
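The merge-up flow in the diagram above might look roughly like the following. This is only a sketch in a throwaway repository: the file names (core.txt, customer-a.conf, customer-b.conf), the commit messages, and the demo identity are made up for illustration, and the branch names are taken from the diagram labels.

```shell
#!/bin/sh
# Sketch: keep shared work on 'base' and merge it up into each customer branch.
# File names and the identity below are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/base   # name the first branch 'base'

echo "shared v1" > core.txt
git add core.txt
git commit -q -m "base: initial shared code"

# Each customer branch starts from base and adds its own file.
git checkout -q -b customer-a base
echo "settings for A" > customer-a.conf
git add customer-a.conf
git commit -q -m "a1: customer A customization"

git checkout -q -b customer-b base
echo "settings for B" > customer-b.conf
git add customer-b.conf
git commit -q -m "b1: customer B customization"

# A later improvement lands on base...
git checkout -q base
echo "shared v2" > core.txt
git commit -q -am "base: improve shared code"

# ...and is merged up into every customer branch.
for branch in customer-a customer-b; do
    git checkout -q "$branch"
    git merge -q -m "Merge base into $branch" base
done

git log --graph --oneline --all
```

Because nothing is rewritten, the customer branches can be shared with other people without surprises; the cost is a history full of merge commits.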
Or, maybe you want to “float” the customer-specific changes on top of the base changes with something like git rebase:
Rebase customer changes on top of four new base changes:
         a1--a2--a3
        /           a1'--a2'--a3'  customer-a
       /           /
o--o--o--o--o--o--o  base
       \           \
        \           b1'  customer-b
         b1
Another change for B, and two more changes in base:
                      a1'--a2'--a3'
                     /
                    /     a1''--a2''-a3''  customer-a
                   /     /
o--o--o--o--o--o--o--o--o  base
                   \     \
                    \     b1''--b2'  customer-b
                     \
                      b1'--b2
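A minimal sketch of the rebase variant, again in a throwaway repository with made-up file names and a demo identity. It shows one customer commit (a1) being replayed as a new commit (a1') after base moves forward.

```shell
#!/bin/sh
# Sketch: "float" customer-specific commits on top of a moving 'base' with git rebase.
# File names and the identity below are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/base

echo "shared v1" > core.txt
git add core.txt
git commit -q -m "base: initial shared code"

git checkout -q -b customer-a base
echo "settings for A" > customer-a.conf
git add customer-a.conf
git commit -q -m "a1: customer A customization"
old_tip=$(git rev-parse customer-a)

# base moves forward after the customer branch was created.
git checkout -q base
echo "shared v2" > core.txt
git commit -q -am "base: improve shared code"

# Replay customer-a's commits on top of the new base tip (a1 becomes a1').
git rebase -q base customer-a
new_tip=$(git rev-parse customer-a)

echo "a1  was $old_tip"
echo "a1' is  $new_tip"
git log --oneline --graph base customer-a
```

The customer commit gets a new ID after the rebase, which is why this approach is awkward once the branch has been published to other people.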
You can interpret each of the above labels (base, customer-a, customer-b) as branches, but you could just as easily publish each one as a single branch in separate repositories with no loss in functionality (though you might want to develop and test with a working repository that has all the history).
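As a rough illustration of publishing each label as a single branch in its own repository, the sketch below pushes two branches from one working repository into two separate bare repositories. The bare repositories under a temporary directory stand in for real remote URLs, and the published branch name main is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: develop in one working repo, publish each line of history to its own repo.
# Paths, file names, and the branch name 'main' are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

root=$(mktemp -d)
git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/base

echo "shared v1" > core.txt
git add core.txt
git commit -q -m "base: initial shared code"

git checkout -q -b customer-a base
echo "settings for A" > customer-a.conf
git add customer-a.conf
git commit -q -m "a1: customer A customization"

# Stand-ins for the published repositories; in practice these would be remote URLs.
git init -q --bare "$root/base.git"
git init -q --bare "$root/customer-a.git"

# Push one local branch into each published repository as its single branch.
git push -q "$root/base.git" base:refs/heads/main
git push -q "$root/customer-a.git" customer-a:refs/heads/main

# The customer repository only exposes its own branch.
git --git-dir="$root/customer-a.git" for-each-ref --format='%(refname)'
```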
Depending on the nature of the project/customer installations, the project-specific data might not even have to be related to the base code/data. If the projects do not require changes to the base files (e.g. all customization is done in configuration files that are not present in the base itself), then you might keep simple linear histories (branches) for the configuration of each project along with whatever history you like (branches/tags) for the base files. Then for each installation, you could check out the base and the project-specific configuration and set it all running.
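One possible way to keep configuration in a history unrelated to the base is an orphan branch, with each installation assembled from both. This is only a sketch: the branch name config-a, the file names, and the use of git archive plus tar to build the installation directory are assumptions, not something the setup above requires.

```shell
#!/bin/sh
# Sketch: base code and per-customer configuration kept in unrelated histories.
# Branch name 'config-a', file names, and the install layout are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

root=$(mktemp -d)
git init -q "$root/work"
cd "$root/work"
git symbolic-ref HEAD refs/heads/base

echo "shared v1" > core.txt
git add core.txt
git commit -q -m "base: shared code"

# Start a history for customer A's configuration that shares no commits with base.
git checkout -q --orphan config-a
git rm -rf -q .
echo "settings for A" > customer-a.conf
git add customer-a.conf
git commit -q -m "config-a: initial configuration"

# Assemble an installation from the two independent histories.
mkdir "$root/install-a"
git archive base | tar -xf - -C "$root/install-a"
git archive config-a | tar -xf - -C "$root/install-a"
ls "$root/install-a"
```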
You might even elect to keep each project's history totally independent and just apply patches as needed to the various projects (but that really amounts to forsaking much of the benefit of using a powerful version control system). You could still publish such a set of “unrelated” branches in a single repository (how you publish really is independent of how you manage history).
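For the fully independent variant, a fix could travel between projects as a patch. The sketch below uses git format-patch and git am between two repositories that share no history; the project names, file contents, and the new_project helper are hypothetical.

```shell
#!/bin/sh
# Sketch: two unrelated project histories, with a fix carried across as a patch.
# Project names, file contents, and the helper function are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

root=$(mktemp -d)
cd "$root"

# Hypothetical helper: create an independent project with its own root commit.
new_project() {
    git init -q "$1"
    printf 'alpha\nbeta\n' > "$1/shared.txt"
    echo "customer $1" > "$1/customer.txt"
    git -C "$1" add .
    git -C "$1" commit -q -m "initial import for $1"
}
new_project project-a
new_project project-b

# Fix something in project A...
printf 'alpha\nbeta fixed\n' > project-a/shared.txt
git -C project-a commit -q -am "fix beta"

# ...export it as a patch, and apply it to project B's unrelated history.
git -C project-a format-patch -1 --stdout > "$root/fix-beta.patch"
git -C project-b am -q "$root/fix-beta.patch"

git -C project-b log --oneline
```

Git records no relationship between the two commits, so tracking which fix has reached which project is left to you.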
Which way you go depends on how you want to manage the history.
- Do you want nice, separate clumps of customer-specific changes at the tips of the history?
- Will multiple people have access to the repositories?
  - Rewriting history (e.g. with rebase) can be painful for multiple users, or even just multiple working repositories of a single user.
- How interrelated are the project-specific changes and the base code/data?
- Are some projects closely related to (share most changes with) other projects?
- Is there a tool (StGit, TopGit, Guilt, etc.) that might help manage per-project history in a convenient way?
Besides just pure history management, you will also have to consider the capabilities of your Git service provider (e.g. GitHub, git.or.cz, a custom gitorious/gitosis/… installation, etc.). If you have multiple developers and want to restrict them to working on certain projects/customers, you may have to publish to multiple repositories (if the Git service you are using does not allow for per-branch permissions). But the history management is the part that is best to get right. You can always change how you publish your history; it is much more painful to have to rewrite published history.
My advice is to read as much of the documentation as possible and make your own informed decisions. The Git User's Manual is a good place to start. I particularly liked Git for Computer Scientists and Git from the bottom up for understanding Git from the inside out. Having a firm grasp on the internals of Git really helps you understand both the (conceptually) simple (git merge) and the more complex (git cherry-pick, git rebase [-i]) Git commands.
I would go for creating new branches. That way there is less repeated code. DRY is the rule of thumb.
You could consider having some kind of library that is linked into every project, and putting the things that can be used in more than one project into that library. They will then automatically appear in all projects.
As for the projects themselves, having a single repo is definitely better. Keep a clean history and you can avoid merges altogether: just cherry-pick the revisions you need from another branch (the ones that can't be split out into the lib).
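A minimal sketch of that cherry-pick approach, assuming one repository with a branch per customer. The branch names, file names, and commit messages are invented for the example.

```shell
#!/bin/sh
# Sketch: copy a single fix from one customer branch to another with git cherry-pick.
# Branch names, file names, and commit messages are illustrative assumptions.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/base

echo "shared v1" > core.txt
git add core.txt
git commit -q -m "base: initial shared code"
git branch customer-b

git checkout -q -b customer-a
echo "settings for A" > customer-a.conf
git add customer-a.conf
git commit -q -m "a1: customer A customization"
echo "shared v1 with fix" > core.txt
git commit -q -am "a2: fix found while working for customer A"
fix=$(git rev-parse HEAD)

# Copy only the fix onto customer-b; a1 stays private to customer-a.
git checkout -q customer-b
git cherry-pick "$fix" > /dev/null

git log --oneline customer-b
```

The picked commit gets a new ID on customer-b, so Git does not record that the two branches share that change.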
So I'd stick to two repos. One for the shared library and one for all your projects based on it.
Note: you will always have "one repo per customer".
The question is: how is that repo used? If it is created from one unique central repo, it will get all the history of that repo.
To elaborate on the "If you have multiple developers and want to restrict them to working on certain projects/customers, you may have to publish to multiple repositories" part of Chris Johnsen's answer:
If you have confidentiality issues between customers, you might consider:
- one central repo (which denies any push to it), which represents the common template
- one cloned repo per customer where the work is done (in one customer-specific branch)
- one private repo (which denies any pull from it) where you can push back the work done from the customer, in several branches.
From that private repo, you can:
- update the template part with some improvements from customer branches
- export patches and apply them on the public central repo.
The push/pull deny mechanisms would then depend on the way you are sharing/accessing your central repo (either through a public Git service provider, or through one of the 8 ways to share your repos).