Continuous integration with distributed source code control
I think I misunderstand something but can’t find what exactly. I googled, but didn't get the idea. There are two popular techniques – continuous integration and distributed source code control. People somehow combine them, but I don’t understand how.
AFAIK, continuous integration means committing to the central repository (pushing) as soon as you’ve tested your code locally. At the same time, distributed systems are loved so much, among other things, because you can commit and commit and commit locally, play with the code, and push it to the others only when you are confident and satisfied enough with it. So although a DVCS doesn’t force you to delay, it does encourage you not to hurry with the push. It seems to me that the classic CI practice of pushing every few hours won’t happen.
So how and when do you link these two things together? Or am I wrong in what I said?
EDIT
I read the first three answers. Thank you for the responses. I'm still confused, but now I can formulate the question more accurately.
In distributed systems there isn't as much pressure to commit (push) frequently as in centralized ones. So are there any guidelines on how often to publish in distributed systems to comply with CI? Is it still several times a day, or is there another version of this rule?
Distributed source control and continuous integration aren't mutually exclusive concepts. In fact, they play very well together.
Even though DVCS are by their nature distributed, you will still have a central repository that represents the traditional "trunk" found in centralized version systems. You should not change your development model in terms of when and what changes you "publish" to your central repository. Because DVCS don't force you to push your changes you need to be very disciplined in that regard.
On the other hand, DVCS enables developers to do smaller, incremental commits on their private branches. Not only are changes easier to follow this way, they are also easier to merge at the end. Having local commits is especially useful when spiking a feature or doing experimental changes. Or when you need to interrupt your work on feature A to fix the very important bug B.
The individual developer decides what gets pushed/published and when. As always, with additional power comes additional responsibility.
You should push/publish changes whenever they are ready. For example, say I want to rename a class. This will touch 50+ files, though only a few lines in each. I do the rename using a refactoring tool.
In a centralized system I would now have to decide whether that is actually worth a commit on its own or whether it's part of a larger piece of work I'm currently doing. In my experience, people usually choose the second option, because they're not sure yet whether they want the rename to be part of permanent history.
In a distributed system I can commit the change locally, and I have a clear history separating mechanical (refactoring) changes from functional code changes. At this point, I don't affect anyone else. I can easily revise that decision later, before I finally push out my changes. It will then be a clean commit on its own.
The problem in this example arises in the following situation: imagine I rename that class on my local branch, or as my "deferred commit". In the meantime someone commits new code to trunk that uses the class I just renamed. It will be a hassle to merge my rename.
Sure, you could have just published that change the moment you made it, in both systems. The responsibility is just the same. But since a DVCS encourages smaller, incremental commits, merging will be easier. Both systems would have given you the same "exit strategy" out of that situation if you had published your changes early.
A continuous integration system is a tool (for example, Cruise Control) that checks your source code repository every N minutes.
Every time something changes (somebody commits code), the CI jumps in, runs all tests and then sends the output (failures or not) somewhere, like email or a screen.
CI does not depend in any way on the kind of VCS you use, whether it is distributed or not.
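To make the "check every N minutes, test on change, report" loop concrete, here is a minimal sketch in Python. It is not how Cruise Control or any other real CI server is implemented; `poll_once`, `get_head`, `run_tests` and `notify` are hypothetical names made up for illustration. The point is only that the loop needs some way to ask "what is the newest revision?", which any VCS, centralized or distributed, can answer.

```python
from typing import Callable, Optional


def poll_once(
    get_head: Callable[[], str],      # hypothetical: returns newest revision id of the central repo
    run_tests: Callable[[], bool],    # hypothetical: builds and tests, True on success
    notify: Callable[[str], None],    # hypothetical: email, screen, chat, ...
    last_seen: Optional[str],
) -> str:
    """Check the repository once; run the tests only if the head revision changed."""
    head = get_head()
    if head != last_seen:
        passed = run_tests()
        notify(f"{head}: {'PASSED' if passed else 'FAILED'}")
    return head


if __name__ == "__main__":
    # Stand-in for three polls of a repository that receives one new push.
    revisions = iter(["a1", "a1", "b2"])
    last = None
    for _ in range(3):
        last = poll_once(lambda: next(revisions), lambda: True, print, last)
```

A real setup would call this from a timer (or be triggered by a repository hook instead of polling) and would implement `get_head` by asking the central repository for its latest revision.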
There are a number of elements to this, some to do with software, some to do with process.
Version control systems are just that: version control. They give you the ability to roll back, branch, merge, etc. That holds whether they are centralised or distributed, and both kinds have upsides and downsides. A VCS per se does not help you code better or run projects better; it facilitates the process of development in teams, if and when the teams are run properly. In other words, you can screw up just as royally using SVN or Mercurial as you can without them.
Where CI comes in is that, rather than coding for several days and only then committing, building the project and testing, coders commit more frequently, every half day to a day at most, and the project is built and tested each time (not released to live). This means that any errors are picked up earlier and can be rectified more easily, since less code has been committed and the programmer's memory is fresher.
CI can be done manually, or it can be scripted: you can write your own CLI scripts, or you can set up one of the many CI tools that integrate with your VCS to run the process automatically or semi-automatically, which reduces the management overhead and the room for mistakes.
The advantage of using existing tools, such as Mercurial with Hudson or SVN with Cruise Control, is that you are piggybacking on the wisdom and experience of people who probably screwed up royally at some point but learnt the lesson. What you cannot do, though, is take a tool out of the box and expect it to work without adding your own process for your own project into the mix.
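As a rough idea of what the "scripted" option can look like, here is a small Python sketch that runs a list of commands in order and stops at the first failure. The function name `run_steps` is made up, and the real update, build and test commands depend entirely on your project, so the demo uses harmless placeholder commands.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple


def run_steps(steps: Sequence[Sequence[str]]) -> Tuple[bool, List[str]]:
    """Run each command in order; stop at the first non-zero exit code."""
    log: List[str] = []
    for cmd in steps:
        result = subprocess.run(cmd, capture_output=True, text=True)
        log.append(f"{' '.join(cmd)} -> exit {result.returncode}")
        if result.returncode != 0:
            return False, log  # later steps are skipped
    return True, log


if __name__ == "__main__":
    # Placeholder steps; replace with your own update/build/test commands.
    ok, log = run_steps([
        [sys.executable, "-c", "print('build')"],
        [sys.executable, "-c", "print('test')"],
    ])
    print("\n".join(log))
    print("OK" if ok else "FAILED")
```

A dedicated CI tool does essentially this, plus scheduling, history and notifications, which is why it is usually less work than maintaining such a script yourself.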