
What is your development checklist for Java low-latency applications?

I would like to create a comprehensive checklist for Java low-latency applications. Can you add your checklist here?

Here is my list

1. Make your objects immutable

2. Try to reduce the use of synchronized methods

3. Locking order should be well documented, and handled carefully

4. Use profiler

5. Use Amdahl's law, and find the sequential execution path

6. Use Java 5 concurrency utilities, and locks

7. Avoid Thread priorities as they are platform dependent

8. JVM warmup can be used

9. Prefer unfair locking strategy

10. Avoid context switching (too many threads are counterproductive)

11. Avoid boxing, un-boxing

12. Give attention to compiler warnings

13. The number of threads should be equal to or less than the number of cores

A low-latency application is tuned for every millisecond.


Although immutability is good, it is not necessarily going to improve latency. Ensuring low latency is likely to be platform dependent.

Beyond general performance, GC tuning is very important. Reducing memory usage helps the GC. In particular, try to reduce the number of middle-aged objects that need to be moved around: keep each object either long-lived or short-lived. Also avoid anything that touches the permanent generation (PermGen, replaced by Metaspace in Java 8).


Avoid boxing/unboxing; use primitive variables if possible.
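As a rough illustration of the boxing point, here is a minimal sketch comparing a boxed accumulator with a primitive one. The class and method names are made up for this example. Both methods return the same result, but the boxed version unboxes and re-boxes on every iteration and may allocate a new Long each time.

```java
final class BoxingExample {
    // Boxed accumulator: every "+=" unboxes the Long, adds, and boxes the
    // result again, which can allocate a new object per iteration.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Primitive accumulator: no wrapper objects are involved in the loop.
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }
}
```

Whether the JIT can remove the boxing in a given case depends on the JVM, so it is worth measuring rather than assuming.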


Avoid context switching wherever possible on the message processing path. Consequence: use NIO and a single event loop thread (reactor).


Buy, read, and understand Effective Java. It is also available online.


Avoid extensive locking and multi-threading in order not to disrupt the enhanced features in modern processors (and their caches). Then you can use a single thread up to its unbelievable limits (6 million transactions per second) with very low latency.

If you want to see a real world low-latency Java application with enough details about its architecture have a look at LMAX:

The LMAX Architecture


Measure, measure and measure. Run benchmarks regularly, using data as close to real as possible on hardware as close to production as possible. Low-latency applications are often better considered as appliances, so you need to consider the whole box as deployed, not just a particular method/class/package/application/JVM etc. If you do not build realistic benchmarks on production-like settings, you will have surprises in production.
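For the in-process part of that measurement, a minimal sketch of a latency recorder might look like the following. LatencyRecorder and its methods are hypothetical names for this example, not an existing library. It runs a warmup phase first, then times each call with System.nanoTime and reports nearest-rank percentiles, since tail latency usually matters more than the average.

```java
import java.util.Arrays;

final class LatencyRecorder {
    private final long[] samples;
    private int count;

    LatencyRecorder(int capacity) {
        samples = new long[capacity]; // preallocated so recording does not allocate
    }

    void record(long nanos) {
        if (count < samples.length) {
            samples[count++] = nanos; // samples beyond capacity are dropped
        }
    }

    // Nearest-rank percentile, p in (0, 100].
    long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Runs the task 'warmup' times untimed, then times 'iterations' runs.
    static LatencyRecorder measure(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        LatencyRecorder recorder = new LatencyRecorder(iterations);
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            recorder.record(System.nanoTime() - start);
        }
        return recorder;
    }
}
```

This only covers one code path inside one JVM; it does not replace benchmarking the whole deployed box as described above.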


Do not schedule more threads in your application than you have cores on the underlying hardware. Keep in mind that the OS also needs threads of its own, and other services may share the same hardware, so your application may be required to use fewer than the maximum number of cores available.
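One way to apply this is to size a fixed thread pool from the core count while leaving some cores free. This is only a sketch: WorkerPools and the reserved parameter are invented for this example, and how many cores to reserve for the OS, GC threads and other services is something you would have to determine for your own box.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class WorkerPools {
    // Number of worker threads: available cores minus the cores kept free
    // for the OS and other services, but never fewer than one.
    static int workerCount(int availableCores, int reserved) {
        return Math.max(1, availableCores - reserved);
    }

    // Fixed-size pool, so the thread count never exceeds the chosen limit.
    static ExecutorService newWorkerPool(int reserved) {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(workerCount(cores, reserved));
    }
}
```

Note that availableProcessors() reports what the JVM sees, which inside a container or with CPU affinity set may differ from the physical core count.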


  • Consider using non-blocking approaches rather than synchronisation.
  • Consider using volatile or atomic variables over blocking data structures and locks.
  • Consider using object pools.
  • Use arrays instead of lists as they are more cache-friendly.
  • For small tasks, sending data to other cores normally takes more time than processing it on a single core, because of locking and memory and cache access latency. Hence, consider processing a task on a single thread.
  • Decrease the frequency of accessing main memory and try to work with data stored in caches.
  • Consider using the server (C2) JIT compiler, which focuses on performance optimizations, rather than C1, which focuses on quick startup time.
  • Make sure you don't have false sharing, where two fields used by different threads end up on the same cache line.
  • Read https://mechanical-sympathy.blogspot.com/
  • Consider using UDP over TCP
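To illustrate the point about atomic variables versus locks, here is a small sketch of a lock-free "highest value seen" tracker built on AtomicLong.compareAndSet. MaxTracker is a hypothetical class for this example. Threads never block; a thread that loses a race simply re-reads the current value and retries.

```java
import java.util.concurrent.atomic.AtomicLong;

final class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    // Lock-free update: retry the compare-and-set until it succeeds or
    // the stored maximum is already at least as large as the new value.
    void observe(long value) {
        long current;
        do {
            current = max.get();
            if (value <= current) {
                return; // nothing to update
            }
        } while (!max.compareAndSet(current, value));
    }

    long get() {
        return max.get();
    }
}
```

Under heavy contention a CAS loop can still spin, so this is not automatically faster than a lock; it mainly avoids blocking and the context switches that come with it.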


Use StringBuilder instead of String concatenation when generating large Strings, for example queries.
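A minimal sketch of what that looks like for a query fragment follows. QueryBuilder and inClause are made-up names, and the initial capacity is just a rough guess to avoid resizing. Building the text in one StringBuilder avoids creating a new intermediate String on every loop iteration.

```java
final class QueryBuilder {
    // Builds e.g. "id IN (1, 2, 3)" with a single StringBuilder.
    static String inClause(String column, int[] ids) {
        StringBuilder sb = new StringBuilder(column.length() + 8 + ids.length * 4);
        sb.append(column).append(" IN (");
        for (int i = 0; i < ids.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(ids[i]); // appends the int directly, no boxing
        }
        return sb.append(')').toString();
    }
}
```

For real SQL with user-supplied values you would still use a PreparedStatement with parameters; this only shows the string-building aspect.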


Another important idea is to get it working first, then measure the performance, then isolate any bottlenecks, then optimize them, then measure again to verify improvement.

As Knuth said, "premature optimization is the root of all evil".


I think "Use mutable objects only where appropriate" is better than "Make your objects immutable". Many very low latency applications have pools of objects they reuse to minimize GC. Immutable objects can't be reused in that way. For example, if you have a Location class:

class Location {
    double lat;
    double lon;
}

You can create some at startup and use them over and over again, so they never cause allocations and the subsequent GC.

This approach is much trickier than using an immutable location object though, so it should only be used where needed.
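A minimal sketch of such a pool for the Location class above could look like this. LocationPool, acquire and release are names invented for this example, and the pool is deliberately not thread-safe: it assumes a single thread owns it, which fits the single-threaded processing style discussed elsewhere in this thread.

```java
import java.util.ArrayDeque;

final class Location {
    double lat;
    double lon;
}

final class LocationPool {
    private final ArrayDeque<Location> free;

    // All instances are allocated up front, at startup.
    LocationPool(int size) {
        free = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            free.push(new Location());
        }
    }

    // Hands out a pooled instance, overwriting its previous contents.
    Location acquire(double lat, double lon) {
        Location loc = free.poll();
        if (loc == null) {
            loc = new Location(); // pool exhausted: fall back to allocating
        }
        loc.lat = lat;
        loc.lon = lon;
        return loc;
    }

    // The caller must not touch 'loc' after releasing it.
    void release(Location loc) {
        free.push(loc);
    }

    int available() {
        return free.size();
    }
}
```

The trickiness mentioned above shows up in release: if any code keeps a reference to a released Location, it will silently see the next user's coordinates.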


In addition to the developer-level solutions already advised here, it can also be very beneficial to consider accelerated JIT runtimes, e.g. Zing, and off-heap memory solutions like Terracotta BigMemory or Apache Ignite to reduce stop-the-world GC pauses. If a GUI is involved, using binary protocols like Hessian or ZeroC ICE instead of web services etc. is very effective.
