Version control of software refactoring
What is the best way of doing version control of large scale refactoring?
My typical style of programming (actually of writing documents as well) is getting something out as quickly as possible and then refactoring it. Typically, refactoring takes place at the same time as adding other functionality. In addition to standard refactoring of classes and functions, functions may move from one file to another, files get split and merged or just reordered.
For the time being, I am using version control as a lone user, so there is no issue of interaction with other developers at this stage. Still, version control gives me two aspects:
- Backup and ability to revert to a good version "in case".
- Looking at the history tells me how the project progressed and the flow of ideas.
I am using mercurial on windows using TortoiseHg which enables selections of hunks to commit. The reason I mention this is that I would like advice on the granularity of a commit in refactoring. Should I split refactoring from functionality added always in committing?
I have looked at the answers of Refactoring and Source Control: How To? but it doesn't answer my question. That question focuses on collaboration with a team. This one concentrates on having a histo开发者_运维知识库ry that is understandable in future (assuming I don't rewrite history as some VCS seem to allow).
I suppose that there is no one size fits all answer to you question :)
Personally I prefer to keep the finer sensible granularity in my commits: in your case I would split the action in two phases: each one independent:
- refactoring (and following commit)
- new functionalities (and commit).
The best thing to do is to add and commit each item on its own: break up the refactoring in localized changes, and commit them one by one, and add the functionalities one by one, committing them along the way.
There is a little more overhead, but in this way when you go back seeking for differences it is clear what was changed for refactoring and what for adding new functionalities. It's also easier to rollback only a particular problematic addition.
Should I split refactoring from functionality added always in committing?
I tend to check-in frequently; and each check-in is either refactoring, or new functionality. It's a cycle of:
- Refactor existing code (without changing its functionality) to ready it to accept the new code
- Add new code (which implements additional functionality).
I would recommend separating refactoring from adding functionality. Perhaps by alternating checkins. This is from my experiences after I discovered uncrustify and would reformat source files while also making code changes. It became very difficult to figure out a real change from just a reformat. Now uncrustify gets its own dedicated commits.
Having dealt with untangling of effects/bugs/side effects of VERY complicated refactoring combined with fairly extensive changes, I can very strongly advise to always try to separate the two as far as your versioning, as much as possible.
If there are any issues, you can VERY easily re-build the code from the tags/labels/versions pertaining to each stage and verify which of the two introduced the issue.
In addition, try to do refactoring in as small as possible logically complete chunks and commit those as separate checkpoints. Again, this simplifies investigations into what broke why/when.
Every answer thus far has advised you to separate refactoring from adding functionality - and I +1'ed them all. You should do this, independent of source control. Martin Fowler wrote a whole book around the concept that you can't refactor simultaneously with changing functionality. You want to know, for any change, whether the code should and does work the same before the change as after. And as @Amardeep points out, it's much harder to see what functional change you have made if it's hidden by formatting or refactoring changes, thus much harder to track down bugs that functional changes introduced. I don't mean by this to discourage you from refactoring, or to postpone it. Do it, by all means, frequently. But do it separately from functional changes. Micro-commits are the way to go.
Take baby steps. Make the smallest useful change, test it, submit, and repeat.
One kind of change at a time. Don't refactor and change behavior at the same time.
Submit often. Small changes with clear, detailed descriptions are invaluable.
Make sure your automated tests are, reliable, and useful. If you can trust your tests, you can do the above easily and quickly.
Make sure your tests always pass.
Often I will start working on new functionality or a bug fix or whatever, to discover that if I refactor things just so, the new functionality will be much easier to add. Usually I will discard (or save elsewhere) my changes so far, refactor/test/submit, then go back to working the new functionality. Ideally I spend 90% of my time refactoring, and each new feature, bug fix, performance improvement, etc. is a simple, single-line change.
精彩评论