Java Profilers That Display Per Request Statistics and Program Flow
I'm looking for profilers that support per request profiling statistics, ideally along the programs flow (not the usual thread call stack). So basically a profiler call stack + sequential calls view for each single request, something like this:
doGet 100ms
+ doFilter 95ms
+ doFilter2 90ms
+ validateValues 20ms
+ calculateX 40ms
+ calc1 10ms
+ calc2 30ms
+ renderResponse 30ms
Which classes / methods are profiled is configured in some way, for a tracing profiler that processes each method call, this is of course not usable.
I know of and have used dynaTrace, its "PurePath" Feature (http://www.dynatrace.com/en/architecture-tame-complexity-with-purepath.aspx) supports this, but I'm looking for tools that are usable in smaller projects and require less of an initial investment and set up.
Does any "classical" profiler (YourKit, etc) support this and I have overlooked the feature?
Addendum: To supply some background: The main goal is to have statistics for the monitoring and analysis of a system in production. First and foremost the idea is to get live statistics of how long requests take and in case response time goes up to have data for certain (types of) requests (think JETM + x).
Per request profiling statistics allows for a detailed analysis of why just some requests are slow, e.g. if 10% of the requests take ten times as long as your average. With aggregated statistics this is AFAIK very hard to solve.
The same goes for a profiling statistics that renders the calls along the program开发者_如何学运维 flow, because it is easy to identify where in the request the problem lies, e.g. a method performs ten DB queries, you see each call as a single one and not just ten aggregated calls.
Ideally the measurement points are configure and en/disabled at run time.
You could try btrace to do selective measurements. It is somewhat similar to dtrace, which you could also use if you are on a supported platform, Solaris, BSD, OS X.
If your application is timing milli-seconds, you could just have a map of time to stage TreeMap which you can summaries and write to a file. This is the most flexible and is fine for milli-second timings.
For micro-second timings, I have an enum value for each stage and then record the current time (System.nanoTime()) when that staging is reached in a ThreadLocal array. (No object allocation) When the request is finished, write the timing deltas to a file e.g. CSV format.
My approach is similar to Peter's but instead of using threadlocals and computing online I write to the log file when the execution reaches interesting stages. Also, I used AspectJ to generate the log lines, which I found very convenient for adding/removing log lines at whim without having to change rest of source code.
精彩评论