
Create a quick/reliable benchmark with Java?

I'm trying to create a benchmark test with Java. Currently I have the following simple method:

public static long runTest(int times) {
    long start = System.nanoTime();
    String str = "str";
    for (int i = 0; i < times; i++) {
        str = "str" + i;  // allocates a new String every iteration
    }
    return System.nanoTime() - start;
}

I currently call this method several times inside an outer loop, which itself runs several times, and record the min/max/avg time the method takes. Then I start some activity on another thread and test again. Basically I just want to get consistent results... It seems pretty consistent if the loop in runTest runs 10 million times:

Number of times ran: 5
The max time was: 1231419504 (102.85% of the average)
The min time was: 1177508466 (98.35% of the average)
The average time was: 1197291937
The difference between the max and min is: 4.58%

Activated thread activity.

Number of times ran: 5
The max time was: 3872724739 (100.82% of the average)
The min time was: 3804827995 (99.05% of the average)
The average time was: 3841216849
The difference between the max and min is: 1.78%

Running with thread activity took 320.83% as much time as running without.

But this seems like a lot of iterations, and it takes some time... if I try a lower number (100,000) in the runTest loop, the results become very inconsistent:

Number of times ran: 5
The max time was: 34726168 (143.01% of the average)
The min time was: 20889055 (86.02% of the average)
The average time was: 24283026
The difference between the max and min is: 66.24%

Activated thread activity.

Number of times ran: 5
The max time was: 143950627 (148.83% of the average)
The min time was: 64780554 (66.98% of the average)
The average time was: 96719589
The difference between the max and min is: 122.21%

Running with thread activity took 398.3% as much time as running without.

Is there a way that I can do a benchmark like this that is both consistent and efficient/fast?

I'm not testing the code that is between the start and end times, by the way. I'm testing the CPU load, in a way (see how I'm starting some thread activity and retesting). So I think what I'm looking for is something to substitute for the code I have in runTest that will yield quicker and more consistent results.

Thanks


In short:

(Micro-)benchmarking is very complex, so use a tool like the benchmarking framework at http://www.ellipticgroup.com/misc/projectLibrary.zip - and still be skeptical about the results ("Put micro-trust in a micro-benchmark", Dr. Cliff Click).

In detail:

There are a lot of factors that can strongly influence the results:

  • The accuracy and precision of System.nanoTime: in the worst case it is as bad as that of System.currentTimeMillis.
  • code warmup and class loading
  • mixed mode: JVMs JIT-compile code (see Edwin Buck's answer) only after a block has been called sufficiently often (by default, 1,500 times for the client VM or 10,000 for the server VM)
  • dynamic optimizations: deoptimization, on-stack replacement, dead code elimination (you should use the result you compute in your loop, e.g. print it; see the sketch after this list)
  • resource reclamation: garbage collection (see Michael Borgwardt's answer) and object finalization
  • caching: I/O and CPU
  • your operating system on the whole: screen saver, power management, other processes (indexer, virus scan, ...)
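
For example, here is a minimal sketch (my illustration, not code from the article) of defeating dead code elimination: make the timed loop compute a value and publish it after timing, so the JIT cannot prove the work useless.

class DceSafeBenchmark {
    static volatile int sink;  // publishing here keeps the loop's work observable

    static long runTest(int times) {
        long start = System.nanoTime();
        int checksum = 0;
        for (int i = 0; i < times; i++) {
            String str = "str" + i;
            checksum += str.length();  // result depends on every iteration
        }
        long elapsed = System.nanoTime() - start;
        sink = checksum;  // consumed after timing, so the loop cannot be elided
        return elapsed;
    }
}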

Brent Boyer's article "Robust Java benchmarking, Part 1: Issues" ( http://www.ibm.com/developerworks/java/library/j-benchmark1/index.html) is a good description of all those issues and whether/what you can do against them (e.g. use JVM options or call ProcessIdleTask beforehand).
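
For instance, standard HotSpot options let you watch JIT compilation and GC activity while the benchmark runs (MyBenchmark is a placeholder for your own class):

java -server -XX:+PrintCompilation -verbose:gc MyBenchmark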

You won't be able to eliminate all these factors, so doing statistics is a good idea. But:

  • instead of computing the difference between the max and min, you should put in the effort to compute the standard deviation (the results {1, 1000 times 2, 3} and {501 times 1, 501 times 3} have the same min, max, and average, but very different standard deviations); see the sketch after this list
  • The reliability is taken into account by producing confidence intervals (e.g. via bootstrapping).
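
A minimal sketch of the standard deviation computation (my illustration; the timing values are made up, and the framework below does all of this far more rigorously):

class TimingStats {
    // Arithmetic mean of the samples.
    static double mean(long[] samples) {
        double sum = 0.0;
        for (long s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (n - 1 in the denominator).
    static double stdDev(long[] samples) {
        double m = mean(samples);
        double sumSq = 0.0;
        for (long s : samples) {
            double d = s - m;
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / (samples.length - 1));
    }

    public static void main(String[] args) {
        long[] runs = {24283026L, 20889055L, 34726168L, 22150311L, 21370422L};  // hypothetical timings in ns
        System.out.printf("mean = %.0f ns, stddev = %.0f ns%n", mean(runs), stdDev(runs));
    }
}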

The above-mentioned benchmark framework (http://www.ellipticgroup.com/misc/projectLibrary.zip) uses these techniques. You can read about them in Brent Boyer's article "Robust Java benchmarking, Part 2: Statistics and solutions" (https://www.ibm.com/developerworks/java/library/j-benchmark2/).


Your code ends up testing mainly garbage collection performance, because appending to a String in a loop creates and immediately discards a large number of increasingly large String objects.

This inherently leads to wildly varying measurements and is strongly influenced by multi-threaded activity.

I suggest you do something else in your loop that has more predictable performance, like mathematical calculations.
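
A sketch of what that could look like (my example, not code from this answer): a pure arithmetic loop allocates nothing, so the garbage collector stays out of the measurement and the timings mostly reflect CPU availability.

class MathLoadTest {
    static volatile double sink;  // prevents dead code elimination

    static long runTest(int times) {
        long start = System.nanoTime();
        double acc = 0.0;
        for (int i = 0; i < times; i++) {
            acc += Math.sqrt(i);  // cheap, allocation-free work
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }
}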


In the 10-million-iteration run, odds are good the HotSpot compiler detected a "heavily used" piece of code and compiled it into native machine code.

Until then, JVM bytecode is interpreted, which leaves it susceptible to more interruptions from other background processes occurring in the JVM (like garbage collection).

Generally speaking, these kinds of benchmarks are rife with assumptions that don't hold. You cannot believe that a micro-benchmark really proves what it set out to prove without substantial evidence that the measurement (time) actually reflects your task and not some other background tasks as well. If you don't attempt to control for background tasks, the measurement is much less useful.
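
A minimal sketch of warming up before measuring (my illustration; the iteration counts are arbitrary): run the method enough times to get it JIT-compiled first, then time only the later runs.

class WarmedUpBenchmark {
    static volatile long sink;  // keeps the work observable

    static long runOnce(int times) {
        long start = System.nanoTime();
        long acc = 0;
        for (int i = 0; i < times; i++) {
            acc += i;
        }
        sink = acc;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Warm-up: exceed the JIT compile threshold before measuring, so the
        // measured runs execute compiled code rather than interpreted bytecode.
        for (int i = 0; i < 20000; i++) {
            runOnce(1000);
        }
        for (int run = 1; run <= 5; run++) {
            System.out.println("Run " + run + ": " + runOnce(10000000) + " ns");
        }
    }
}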
