NIO Performance Improvement compared to traditional IO in Java

I have seen many articles/blogs saying that Java NIO is a better solution compared to traditional Java IO.

But today one of my co-workers showed me this blog post: http://mailinator.blogspot.com/2008/02/kill-myth-please-nio-is-not-faster-than.html. I am wondering whether anyone from the Java community has done this kind of benchmarking of Java NIO performance.


NIO vs IO is a pretty fun topic to discuss.

It's been my experience that the two are two different tools for two different jobs. I have heard of IO being referred to as the 'Thread per Client' approach and NIO as the 'One thread for all Clients' approach and I find the names, while not 100% accurate, to be fitting enough.

The real issue with NIO and IO, as I see it, is with scalability.

An NIO network layer will (should?) use a single thread to handle the selector and dispatch read/write/accept jobs to other thread(s). This lets the thread handling the selector (the 'Selector Thread') do nothing but that, which allows for much faster response when dealing with a lot (note the lack of actual numbers) of clients. Where NIO starts to fall apart is when the server receives so many read/write/accept events that the Selector Thread is constantly working. Any additional jobs past this point and the server starts to lag. Additionally, since every read/write/accept job goes through the one Selector Thread, adding CPUs to the mix will not improve performance.
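A minimal sketch of the selector-thread design described above, assuming a simple echo protocol. The class name `SelectorEchoServer` and its methods are made up for illustration, and for brevity the one thread does the reads and writes itself instead of handing them to workers.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical example class: one thread multiplexes all clients through a Selector.
final class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", port)); // port 0 picks a free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buf);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // selector was closed or I/O failed: leave the loop
        }
    }

    private static void echo(SocketChannel client, ByteBuffer buf) throws IOException {
        buf.clear();
        int n;
        try {
            n = client.read(buf);
        } catch (IOException e) {
            n = -1; // treat a reset connection like end of stream
        }
        if (n < 0) {
            client.close();
            return;
        }
        buf.flip();
        // Simplification: spin until everything is written. A real server
        // would register OP_WRITE and resume when the socket is writable.
        while (buf.hasRemaining()) client.write(buf);
    }

    @Override
    public void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

Run it with `new Thread(new SelectorEchoServer(0)).start()`. Note that every client shares that one thread, which is exactly why a busy selector thread becomes the bottleneck.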

An IO network layer will likely take the approach of one thread per socket, including the listening socket, so the number of threads is directly proportional to the number of clients. With a moderate number of clients, this approach works really well. The price you pay is the cost of a thread. If you have 2000 clients attached to a server, you have at least 2001 threads. Even on a quad-chip machine with 6 cores per chip, you only have 24 processing nodes (48 if you count HyperThreading) to handle those 2001 threads. Instantiating all those threads costs CPU time and RAM, and even if you use thread pooling, you still pay for the context switches as the CPUs move from thread to thread. This can get very ugly at high server loads and, if not coded properly, can grind the entire machine to a halt. On the plus side, adding CPUs to the machine WILL improve performance in this case.
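For comparison, a rough sketch of the thread-per-client approach with plain `java.io` streams, again assuming an echo protocol. `ThreadPerClientEchoServer` is an illustrative name, not something from the linked article.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical example class: one listening thread plus one thread per connected client.
final class ThreadPerClientEchoServer implements Runnable, AutoCloseable {
    private final ServerSocket server;

    ThreadPerClientEchoServer(int port) throws IOException {
        // port 0 picks a free port; bind to loopback only for this sketch
        server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
    }

    int port() {
        return server.getLocalPort();
    }

    @Override
    public void run() {
        try {
            while (true) {
                Socket client = server.accept();                 // blocks the listening thread
                Thread worker = new Thread(() -> serve(client)); // N clients => N + 1 threads
                worker.setDaemon(true);
                worker.start();
            }
        } catch (IOException e) {
            // server socket closed: stop accepting
        }
    }

    private static void serve(Socket client) {
        try (Socket c = client;
             InputStream in = c.getInputStream();
             OutputStream out = c.getOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) { // blocks this worker only
                out.write(buf, 0, n);
                out.flush();
            }
        } catch (IOException e) {
            // client went away
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
    }
}
```

The code is simpler than the selector version, and each worker can run on its own core, but every idle client still pins a thread.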

Now all that is well and good, but it is abstract, because there are no numbers in my description to help you decide between IO and NIO. That is because there are even more variables to consider:

  • Life time of a client? Short or long?
  • Amount of data expected per client? Lots of small chunks or few huge chunks?
  • Just how many clients are expected to be connected simultaneously?
  • What OS are you on and what JVM are you using? Both factor into Thread and polling costs.

Just some food for thought. To answer the question of which is faster, NIO or IO: Both and neither :)


A is faster than B is often a very simplistic view and sometimes plain wrong.

NIO is not automatically faster than plain IO.

Some operations are potentially faster using NIO, and you can scale to many network connections much more easily with NIO (because you don't need one thread per connection).

But NIO is not a magic "make stuff faster"-switch that needs to be applied to everything.


NIO is used not because it's faster but because it scales better, especially when there are large numbers of clients.

IO (blocking IO / stream IO) usually uses one thread per connection to give clients a better response. Suppose you used a single thread to (blocking) listen, (blocking) read, process, and (blocking) write for all the clients. That is like Starbucks serving all its customers at a single window: the customers (your clients) would get impatient (time out).

You might consider a thread pool to keep a huge number of threads from dragging down your server. But that is like Starbucks lining all the customers up at several windows: customers are still delayed because others ahead of them block. That is why one thread per connection is a good choice in traditional Java IO programming.

NIO (non-blocking IO, or block-oriented IO, whichever name you prefer) uses the Reactor pattern to handle IO events. In this case a single thread can block on the selector and then listen, read, process, and write for all clients, so one client's waiting period does not affect the others.

Note that both IO and NIO can use multiple threads to make use of more CPU resources; there are more details in Doug Lea's introduction.


The problem with the article is it compares blocking IO vs non-blocking NIO. In my own tests comparing blocking IO vs blocking NIO (more like for like) NIO is up to 30% faster.
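To make "blocking NIO" concrete: NIO channels are blocking by default, so they can be used in the same one-call-at-a-time style as streams. The sketch below is not the benchmark mentioned above; `BlockingNioEcho` and its two methods are illustrative names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: NIO channels used in blocking mode, no Selector involved.
final class BlockingNioEcho {
    // Accepts one client on a blocking ServerSocketChannel and echoes until EOF.
    static void serveOne(ServerSocketChannel server) throws IOException {
        try (SocketChannel client = server.accept()) { // blocks, like ServerSocket.accept()
            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (client.read(buf) != -1) {           // blocks until data or EOF
                buf.flip();
                while (buf.hasRemaining()) client.write(buf);
                buf.clear();
            }
        }
    }

    // Sends a message and waits for the same number of bytes to come back.
    static String roundTrip(InetSocketAddress address, String message) throws IOException {
        try (SocketChannel ch = SocketChannel.open(address)) {
            byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
            ByteBuffer out = ByteBuffer.wrap(bytes);
            while (out.hasRemaining()) ch.write(out);
            ByteBuffer in = ByteBuffer.allocate(bytes.length);
            while (in.hasRemaining() && ch.read(in) != -1) {
                // keep reading until the reply is complete or the peer closes
            }
            return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
        }
    }
}
```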

However, unless your application is trivial, like a proxy server, it is unlikely to matter. What the application does is far more important. Both IO and NIO have been tested with up to 10,000 connections.

If you want super fast IO, you can use asynchronous IO (Java 7+) with InfiniBand (not cheap, but lower latency).
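The Java 7 asynchronous API lives in `java.nio.channels` (`AsynchronousServerSocketChannel`, `AsynchronousSocketChannel`, `CompletionHandler`). Here is a small echo sketch over ordinary TCP. It says nothing about InfiniBand, and `AsyncEchoServer` is just an illustrative name.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

// Hypothetical example class: callbacks run on the default channel group's threads.
final class AsyncEchoServer implements AutoCloseable {
    private final AsynchronousServerSocketChannel server;

    AsyncEchoServer(int port) throws IOException {
        server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", port)); // port 0 picks a free port
        acceptNext();
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    private void acceptNext() {
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override
            public void completed(AsynchronousSocketChannel client, Void att) {
                acceptNext(); // re-arm the accept for the next client
                readNext(client, ByteBuffer.allocate(4096));
            }

            @Override
            public void failed(Throwable exc, Void att) {
                // channel closed or accept failed: stop accepting
            }
        });
    }

    private static void readNext(AsynchronousSocketChannel client, ByteBuffer buf) {
        buf.clear();
        client.read(buf, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer n, Void att) {
                if (n < 0) {
                    closeQuietly(client);
                    return;
                }
                buf.flip();
                writeAll(client, buf);
            }

            @Override
            public void failed(Throwable exc, Void att) {
                closeQuietly(client);
            }
        });
    }

    private static void writeAll(AsynchronousSocketChannel client, ByteBuffer buf) {
        client.write(buf, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer n, Void att) {
                if (buf.hasRemaining()) writeAll(client, buf); // partial write: continue
                else readNext(client, buf);                    // done: wait for more input
            }

            @Override
            public void failed(Throwable exc, Void att) {
                closeQuietly(client);
            }
        });
    }

    private static void closeQuietly(AsynchronousSocketChannel c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // nothing useful to do here
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
    }
}
```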


The article you cite is three years old. It used Java 1.4.2 (4).

Since then Java 5, 6, and now 7 are out.

Huge changes inside of the JVM as well as the class library have rendered anything having to do with benchmarking of 1.4.2 irrelevant.

If you begin to dig, you'll also note that the distinction between java.io and java.nio isn't quite so clear. Many of the java.io calls now resolve into java.nio classes.

But whenever you want increased performance, the solution is not to just do any one thing. The only way to know for sure is to try different techniques and measure them, because what's fast for my application isn't necessarily so for your application, and vice-versa. NIO might well be slower for some applications. Or it might be the solution to performance problems. Most likely, it's a little of both.


Also, AFAIK, Java IO was rewritten to use NIO under-the-covers (and NIO has more functionality). Microbenchmarks are just a bad idea, particularly when they're old, as lavinio states.


Java NIO is considered to be faster than regular IO because:

  1. Java NIO supports non-blocking mode. Non-blocking IO is faster than blocking IO because it does not require a dedicated thread per connection. This can significantly improve scalability when you need to handle lots of simultaneous connections, as threads are not very scalable.

  2. Java NIO reduces data copying by supporting direct memory buffers. It is possible to read and write NIO sockets without any data copying at all. With traditional Java IO, the data is copied multiple times between the socket buffers and byte arrays.
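As a rough illustration of the second point, the sketch below moves data between two channels through a direct buffer. It uses `FileChannel` so it can run without a network, but the same read/flip/write loop applies to `SocketChannel`. The class name `DirectBufferCopy` is made up, and whether a direct buffer actually saves time depends on the JVM and OS.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class: channel-to-channel copy through a direct (off-heap) buffer.
final class DirectBufferCopy {
    static long copy(Path source, Path target) throws IOException {
        // Direct buffers live outside the Java heap, so the JVM can hand them
        // to native I/O calls without first copying into a temporary buffer.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        long total = 0;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buf) != -1) {
                buf.flip();                      // switch from filling to draining
                while (buf.hasRemaining()) total += out.write(buf);
                buf.clear();                     // ready for the next read
            }
        }
        return total;
    }
}
```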


Java NIO and the reactor pattern are not much about networking performance itself, but about the advantages that the single-threaded model can deliver to a system in terms of performance and simplicity. And that, the single-threaded approach, can lead to dramatic improvements. Take a look here: Inter-socket communication with less than 2 microseconds latency


There is no inherent reason why one is faster than the other.

The one-connection-per-thread model currently suffers from the fact that a Java thread has a big memory overhead: the thread stack is preallocated to a fixed (and big) size. That can be and should be fixed; then we could cheaply create hundreds of thousands of threads.


Java IO encompasses several constructs and classes, so you cannot compare at such a general level. Specifically, NIO can use memory-mapped files for reading. This is theoretically expected to be slightly faster than reading a file through a simple BufferedInputStream. However, if you compare it with something like a RandomAccessFile read, then an NIO memory-mapped file will be a lot faster.
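A minimal sketch of reading a file through a memory mapping, assuming the file fits in one mapping (a single `FileChannel.map` call is limited to `Integer.MAX_VALUE` bytes). `MappedFileSum` is an illustrative name, and the byte sum is just a stand-in for real processing.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class: reads every byte of a file via a read-only mapping.
final class MappedFileSum {
    static long sumBytes(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Assumption: the file is smaller than 2 GiB, so one mapping covers it.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (map.hasRemaining()) {
                sum += map.get() & 0xFF; // treat each byte as unsigned
            }
            return sum;
        }
    }
}
```

Random access works the same way with `map.get(index)`, which is where the comparison with `RandomAccessFile` seeks becomes interesting.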
