Should I put regression tests with large test data in the source code repo?

I have a suite of scripts and modules totaling a few megabytes. The regression tests and the necessary data are hundreds of megabytes because of the nature of the data we work with. Is it "best practice" to keep regression tests and large test data with the actual source code?

Note that there is a separate set of unit tests, which are much smaller and test individual modules. But a run of the major pipelines requires real (big) data to be useful.


I think you should look at the various forces that play a role here.

  • A specific version of the tests (and their data) tests a specific version of the code. Therefore it is desirable to be able to commit changes to both tests and code, together.

  • Having large test sets under source control could hurt performance for someone who doesn't always need them: an "svn checkout" (or "cleartool mkview -snapshot", or what have you) copies a lot of files, a test run becomes longer, etc. Therefore it is desirable to separate the big from the small, the integration test from the unit test.

My conclusion is then to keep them together in one repository, but make sure that there is a way to work with everything-except-the-big-tests-and-their-big-data. In Subversion, for instance, one could have folders /code/src, /code/test/unit, /code/test/integration, and /testdata. That way many developers could just "svn checkout .../code", and ignore the big test sets. And your continuous build tool would use the entire tree, so that it can run the integration tests.
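As a concrete sketch of that layout (the directory names and the `svn` URLs are illustrative, not a prescribed structure):

```shell
# Hypothetical repository layout: small code + unit tests under /code,
# big integration data kept apart under /testdata.
mkdir -p repo/code/src
mkdir -p repo/code/test/unit
mkdir -p repo/code/test/integration
mkdir -p repo/testdata

# A developer who doesn't need the big data checks out only the code subtree:
#   svn checkout http://svn.example.com/repo/code code
# The continuous build server checks out the whole tree, data included:
#   svn checkout http://svn.example.com/repo repo
```

Because everything lives in one repository, a single commit can still change code, tests, and test data together, while day-to-day checkouts stay small.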
