Should your test data be in the same form as the live data?

2023-01-30 02:20

When testing systems (any system, really, e.g. a database), is it important that the test data is in the same form (format) as the live data?

To what degree do you allow differences between the two types of data?

Thanks

Put it this way: the more different your test data is from your live data, the less valuable the testing actually is. So yes, your test data should be as close as possible to your live data.

Barring specific reasons to use fake data, I think it's important to get as close as you can to the live data when testing. Otherwise you are very likely to miss issues that only show up with real-world values.

Specific reasons you might use fake data:

live data has privacy or sensitivity concerns; you might use fake credit card numbers (but in the proper format), or obfuscate names and phone numbers
live data volume is too high for speedy testing; in this case you should select a representative sample
using live data might cause external side effects; for example, you might not want to use real email addresses if emails could go to real users during tests. However, this last one is better solved by mocking your email system.
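
To make the first two points concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function names (fake_card_number, mask_phone, representative_sample) are made up for this example, the card numbers are random digits that merely pass the Luhn checksum, and "representative" is taken to mean "keep the same proportion of each category, and at least one row per category".

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit to append to a string of digits."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` is doubled,
    # because the check digit will sit immediately to its right.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Check a full digit string (including its check digit)."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def fake_card_number(rng: random.Random, length: int = 16) -> str:
    """Random digits plus a valid check digit: proper format, not a real card."""
    body = "".join(str(rng.randrange(10)) for _ in range(length - 1))
    return body + luhn_check_digit(body)


def mask_phone(phone: str, rng: random.Random) -> str:
    """Replace every digit with a random one; keep spacing and punctuation."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in phone)


def representative_sample(rows, key, fraction, rng):
    """Sample the same fraction from each category, at least one row each."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

Passing in a seeded random.Random (for example random.Random(0)) keeps the generated data reproducible between test runs. Sampling per category rather than across the whole table is a design choice that keeps rare cases from disappearing out of a small sample.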

I try to use both: test data that hits specific cases I have designed (often modified from live data), and a significant volume of live data whenever it is available. The live data hits a large number of scenarios that could actually affect customers, and may include scenarios I haven't thought of.

Keep in mind precisely what you are testing at any given moment. If you are only testing a data acceptance service that should grab any file and leave rejecting bad formats to a later stage, then you don't care much about what is inside the files, and you will need at least some test files in other formats. In that case, changing the extension on a Notepad text file may be enough for functional testing, along with some large generated files to test file size handling, etc.
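
As a rough sketch of that kind of throwaway test data, the Python below writes a few small files that differ only in extension, plus one large file of a chosen size. The helper name make_test_files, the file names, and the extensions are all hypothetical; pick whatever your acceptance service is supposed to accept or reject.

```python
import tempfile
from pathlib import Path


def make_test_files(directory: Path, extensions, large_size_bytes: int):
    """Create small files with different extensions and one large file."""
    directory.mkdir(parents=True, exist_ok=True)
    created = []

    # Same placeholder content every time; only the extension changes.
    for ext in extensions:
        path = directory / f"sample{ext}"
        path.write_text("placeholder content\n")
        created.append(path)

    # Write the large file in chunks so memory use stays small.
    big = directory / "large.bin"
    chunk = b"\0" * 65536
    remaining = large_size_bytes
    with open(big, "wb") as fh:
        while remaining > 0:
            n = min(remaining, len(chunk))
            fh.write(chunk[:n])
            remaining -= n
    created.append(big)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_test_files(Path(tmp), [".csv", ".xml", ".txt"], 5_000_000):
            print(path.name, path.stat().st_size)
```

Files like these only exercise the "grab any file" step. They say nothing about how the later format validation behaves, which still needs realistic content.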

Using test data that does not match the live format can be especially useful if the format is still being worked out while the devs start work on the other parts of the system. However, at some point you will want to run live or similar-to-live data through every part of your system for integration and end-to-end testing.

I disagree with MusiGenesis, unless you are testing your ability to read from the data source.

If you are just testing how the system performs with certain data, then you can just use mocking to remove all connectivity to external data sources. However, if you need to test things like handling failures in connections and dropping connections, then you will probably want to try to connect to the same type of data source.
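
A small sketch of the mocking approach, using Python's standard unittest.mock. The OrderReport class and its fetch_orders method are invented for this example and stand in for whatever code reads from your data source. The second test only simulates a dropped connection by raising ConnectionError; it does not replace a test against the real type of data source.

```python
from unittest import mock


class OrderReport:
    """Hypothetical code under test: sums order amounts from a data source."""

    def __init__(self, source):
        self.source = source

    def total(self):
        try:
            rows = self.source.fetch_orders()
        except ConnectionError:
            # Report "no result" instead of crashing when the source is down.
            return None
        return sum(row["amount"] for row in rows)


def test_total_with_mocked_source():
    # No real connection: the mock returns canned rows.
    source = mock.Mock()
    source.fetch_orders.return_value = [{"amount": 10}, {"amount": 32}]
    assert OrderReport(source).total() == 42
    source.fetch_orders.assert_called_once_with()


def test_total_when_connection_drops():
    # side_effect makes the mocked call raise, imitating a dropped connection.
    source = mock.Mock()
    source.fetch_orders.side_effect = ConnectionError("connection dropped")
    assert OrderReport(source).total() is None


if __name__ == "__main__":
    test_total_with_mocked_source()
    test_total_when_connection_drops()
    print("ok")
```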

I think it's more complex than some people have made out, and I would generally have the following test environments:

Unit Test - Partial Copy of production data
System Test - Stale but full copy of production data with interfaces from other system test environments
Production Acceptance - Same as system test but fed from other PA systems and may have more data if you use massive data sets
Production maintenance - Copy of production refreshed frequently (e.g. weekly) with no interfaces but the ability to implement them quickly. This is used for fixing big production mistakes.
