Where do I get "junk" data to help test my code?

For my C class I've written a simple statistics program -- it calculates max, min, mean, etc. Anyway, I've gotten the program successfully compiled, so all I need to do now is actually test it; the only problem is that I don't have anything to test with.

In my case, I need a list of doubles -- my program needs to accept between 2 and 1,000,000 of them. Is there some resource online that can produce lists of otherwise meaningless data? I know Lorem Ipsum gets used for typesetting, and I'm wondering if there's something similar for various types of numerical data.

Or am I out of luck, and I'll have to just create my own junk data?


The problem with testing software is not the source of the data, but the test set. I mean, can you test an int sum(int a, int b) function just by feeding it random numbers? No, you need to know what to expect. That is what a test set is: inputs paired with their expected outputs.

What do you say when you discover that 548888876+99814465=643503341? How can you tell whether this is the correct result?

More than finding random numbers to give your program, you must somehow know the results of your computation in advance so that you can compare against them.

There are a few ways to do it. What I suggest is to pick a random number generator (amphetamachine +1) and run the same data through both your code and a program that you already know is good, e.g. Matlab for your purposes. After computing your statistics with both, compare the results to see whether your code is correct or needs some debugging.

By the way, I deliberately altered the result of the sum above...
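
To make the idea of a test set concrete, here is a minimal sketch in C. The mean() function is only a hypothetical stand-in for the code under test, and the expected values were worked out by hand for illustration; in practice they could just as well come from a trusted tool such as Matlab.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the function under test. */
static double mean(const double *xs, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += xs[i];
    return sum / (double)n;
}

/* One entry of the test set: an input and the answer known in advance. */
struct mean_case {
    double input[4];
    size_t n;
    double expected;
};

static const struct mean_case cases[] = {
    { {1.0, 2.0, 3.0, 4.0}, 4, 2.5 },
    { {-1.0, 1.0},          2, 0.0 },
    { {0.5, 0.5, 0.5},      3, 0.5 },
};

/* Returns how many cases differ from their expected value by more than tol. */
static int run_mean_cases(double tol)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        double diff = mean(cases[i].input, cases[i].n) - cases[i].expected;
        if (diff < 0.0)
            diff = -diff;
        if (diff > tol)
            failures++;
    }
    return failures;
}
```

Comparing with a small tolerance rather than == matters for doubles, since two correct implementations can differ in the last few bits.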


What about just generating a random double?

// C# (.NET): NextDouble() returns a value in [0.0, 1.0)
Random r = new Random();
for (int i = 0; i < 100000; i++)
{
    double number = r.NextDouble();
    // do something with the value, e.g. Console.WriteLine(number);
}


Since the data you need will depend on the program, there is no source of generic data that I know of.

If you are able to write that program, you should be able to write a script to generate dummy data for yourself.

Just use a loop to print out random numbers within the range your program can accept.
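
A loop like that might look as follows in C. This is only a sketch: the function names are made up for illustration, and rand() is used because it is in the standard library, not because it is a high-quality generator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a pseudo-random double in [lo, hi], built on the standard rand(). */
static double random_double(double lo, double hi)
{
    assert(lo <= hi);
    return lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Writes count values to out, one per line.
   Returns 0 on success, -1 if a write fails. */
static int write_random_doubles(FILE *out, size_t count, double lo, double hi)
{
    for (size_t i = 0; i < count; i++) {
        if (fprintf(out, "%.6f\n", random_double(lo, hi)) < 0)
            return -1;
    }
    return 0;
}
```

Calling srand() with a fixed seed before write_random_doubles(stdout, count, lo, hi) gives the same list on every run with the same C library, which makes a failing test reproducible. Redirect the output to a file to keep it.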


Generate a file with random bytes:

$ dd \
    of=random-bytes \
    if=/dev/urandom \
    bs=1024 \
    count=1024
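
Raw random bytes are not directly usable as doubles: reinterpreting 8 arbitrary bytes as a double can produce NaN, infinities, or wildly varying magnitudes. One possible way to consume such a file, sketched here with hypothetical function names, is to read 64 bits at a time and scale the top 53 bits into [0, 1). The 1 MiB file produced above would give 131,072 values this way.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Maps 64 random bits to a double in [0, 1) by keeping the top 53 bits. */
static double bits_to_unit_double(uint64_t bits)
{
    return (double)(bits >> 11) * (1.0 / 9007199254740992.0); /* divide by 2^53 */
}

/* Reads up to max values from a file of random bytes into out.
   Returns how many values were stored (0 if the file cannot be opened). */
static size_t read_doubles_from_bytes(const char *path, double *out, size_t max)
{
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;

    size_t n = 0;
    uint64_t bits;
    while (n < max && fread(&bits, sizeof bits, 1, f) == 1)
        out[n++] = bits_to_unit_double(bits);

    fclose(f);
    return n;
}
```

Multiply and shift the results if the program under test expects a range other than [0, 1).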


http://www.generatedata.com/#generator

I've used that data generator before with some success. To be fair, it will usually involve copy/pasting the data it generates into some other format that you'll be able to read in.

You can generate your own data for this specific case quite easily though. Loop a random number of times, up to a limit of 1,000,000, generating random doubles within the range you expect. Feed that in and away you go.

Generating your own test data in this case is probably the best option.


You could take the first million digits of pi and chop them up into however many doubles you want.

The first few could be 3.14159, 2.65358, 9.79323, 8.46264, 3.38327, 9.50288, 4.19716, and 9.39937, for example.
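
The chopping itself is easy to sketch in C. The function name and signature below are assumptions made for illustration; the digit string would have to be obtained separately, with the decimal point removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits a string of decimal digits into values of the form d.dddd...,
   consuming digits_per_value digits for each one. Leftover digits at the
   end are ignored. Returns the number of values written to out. */
static size_t chop_digits(const char *digits, size_t digits_per_value,
                          double *out, size_t max)
{
    assert(digits_per_value >= 1 && digits_per_value <= 15);
    size_t len = strlen(digits);
    size_t n = 0;

    for (size_t pos = 0; pos + digits_per_value <= len && n < max;
         pos += digits_per_value) {
        long long whole = 0;   /* the digits as one integer, e.g. 314159 */
        double divisor = 1.0;  /* 10^(digits_per_value - 1) */
        for (size_t i = 0; i < digits_per_value; i++) {
            whole = whole * 10 + (digits[pos + i] - '0');
            if (i > 0)
                divisor *= 10.0;
        }
        out[n++] = (double)whole / divisor;
    }
    return n;
}
```

With six digits per value, the string "314159265358" yields 3.14159 and 2.65358, matching the start of the list above. Note that every value produced this way lies in [0, 10), so it exercises only a narrow range of inputs.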
