One thread per client. Doable?
I'm writing a Java server which uses plain sockets to accept connections from clients. I'm using the fairly simple model where each connection has its own thread reading from it in blocking mode. Pseudo code:
handshake();
while (!closed) {
    length = readHeader(); // this usually blocks a few seconds
    readMessage(length);
}
cleanup();
(Threads are created from an Executors.newCachedThreadPool(), so there shouldn't be any significant overhead in starting them.)
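For reference, a minimal runnable sketch of the setup described above. The framing (a 4-byte big-endian length header followed by the payload), the class name BlockingServer, and port 9000 are assumptions made for illustration; the real protocol is not specified here, and the handshake step is omitted.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingServer {
    // Assumed framing: 4-byte big-endian length, then that many payload bytes.
    static byte[] readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();          // blocks until a header arrives
        if (length < 0) {
            throw new IOException("negative length: " + length);
        }
        byte[] body = new byte[length];
        in.readFully(body);                 // blocks until the whole message is read
        return body;
    }

    // One pooled thread per connection, each reading in blocking mode.
    static void handle(Socket client) {
        try (Socket s = client;
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            while (true) {
                byte[] message = readFrame(in);
                // message processing would go here
            }
        } catch (EOFException e) {
            // client closed the connection; fall through to cleanup
        } catch (IOException e) {
            // connection error; try-with-resources has closed the socket
        }
    }

    static void serve(ServerSocket server, ExecutorService pool) throws IOException {
        while (!server.isClosed()) {
            Socket client = server.accept();    // blocks until a client connects
            pool.execute(() -> handle(client));
        }
    }

    public static void main(String[] args) throws IOException {
        ExecutorService pool = Executors.newCachedThreadPool();
        try (ServerSocket server = new ServerSocket(9000)) { // hypothetical port
            serve(server, pool);
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that the cached pool only reuses idle threads: with N clients connected at the same time there are still N live threads.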
I know this is a bit of a naive setup and it wouldn't really scale well to many connections if the threads were dedicated OS threads. However, I've heard that multiple threads in Java can share one hardware thread. Is that true?
Knowing that I'll be using the Hotspot VM on Linux, on a server with 8 cores and 12GB of RAM, do you think this setup will work well for thousands of connections? If not, what are the alternatives?
This will scale well for up to hundreds of connections, not to thousands. One issue is that a Java thread takes quite a bit of stack as well (e.g. 256K), and the OS will have problems scheduling all your threads.
Look at Java NIO or frameworks that will help you get started doing complex stuff more easily (e.g. Apache MINA).
It is possible this will scale to thousands of clients. But how many thousands is the next question.
A common alternative is to use Selectors and non-blocking I/O, found in the java.nio package.
Eventually you get into the question of whether it's useful to set up your server in a clustered configuration, balancing the load over multiple physical machines.
To get good performance when handling many sockets you usually use a select approach, which is how the Unix API handles single-threaded multi-socket applications that need many resources.
This can be done through the java.nio package, which has a Selector class that basically goes through all the open sockets and notifies you when new data is available.
You register all the open channels with a single Selector and then you can handle all of them from just one thread.
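A rough sketch of what the single-threaded Selector loop can look like, using only standard java.nio calls. The class name SelectorServer, the helper pollOnce, the 4096-byte buffer and port 9000 are illustrative assumptions, and the message handling is left as a placeholder.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class SelectorServer {
    // Waits up to timeoutMillis for ready channels, handles them,
    // and returns how many keys were processed.
    static int pollOnce(Selector selector, long timeoutMillis) throws IOException {
        selector.select(timeoutMillis);
        int handled = 0;
        Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
        while (keys.hasNext()) {
            SelectionKey key = keys.next();
            keys.remove();                      // the selected set must be cleared by hand
            if (!key.isValid()) {
                continue;
            }
            if (key.isAcceptable()) {
                SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                if (client != null) {
                    client.configureBlocking(false);
                    // Attach a per-connection buffer to the key.
                    client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(4096));
                }
            } else if (key.isReadable()) {
                SocketChannel client = (SocketChannel) key.channel();
                ByteBuffer buffer = (ByteBuffer) key.attachment();
                int n;
                try {
                    n = client.read(buffer);    // never blocks in non-blocking mode
                } catch (IOException e) {
                    n = -1;                     // treat a read error like a close
                }
                if (n < 0) {
                    key.cancel();
                    client.close();
                } else {
                    // Placeholder: a real server would parse complete messages here.
                    buffer.clear();
                }
            }
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(9000)); // hypothetical port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            while (true) {
                pollOnce(selector, 1000);
            }
        }
    }
}
```

The catch is that a read may deliver only part of a message, so each connection needs its own buffer and parsing state, which is why this is harder to get right than the blocking version.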
You can find additional information in a tutorial here.
The JVM on Linux uses one-to-one thread mapping. This means that every Java thread is mapped to one native OS thread.
So creating a thousand threads or more is not a good idea because it will impact your performance (context switching, cache flushes/misses, synchronization latency, etc.). It also doesn't make much sense if you have fewer than a thousand CPUs.
The only adequate solution for serving many clients in parallel is to use asynchronous I/O. Please see this answer on Java NIO for details.
See also:
- Green threads
- Solaris threading models
Try Netty.
The "one thread per request" model is the way most Java app servers are written. Your implementation can scale as well as they do.
Threads aren't as expensive as they used to be, so an "ordinary" IO implementation can be OK up to a point. However, if you are looking at scaling to thousands or beyond, it is probably worth investigating something more sophisticated.
The java.nio package solves this by providing socket multiplexing/non-blocking IO, which allows you to bind several connections to one Selector. However, this solution is much harder to get right than the simple blocking approach because of the multithreading and non-blocking aspects.
If you wish to pursue something beyond the simple IO, then I would suggest looking at one of the good quality network abstraction libraries out there. From personal experience I can recommend Netty, which does most of the fiddly NIO handling for you. It does have a bit of a learning curve, but once you get used to the event-based approach it is very powerful.
If you have any interest in leveraging deployment and management of an existing container, you might look at making a new protocol handler inside of Tomcat. See this answer to a related question.
UPDATE: This post from Matthew Schmidt claims the NIO-based connector (written by Filip Hanik) in Tomcat 6 achieved 16,000 concurrent connections.
If you want to write your own connector, take a look at MINA to help with NIO abstractions. MINA also has management features which may eliminate the need for another container (should you be concerned about deployment of many units and their operation, etc.).
I'd suggest it depends more on exactly what else the server is doing when it processes the messages. If it's relatively lightweight then your machine's spec should EASILY cope with merely handling the connections of thousands of such processes. Tens of thousands is another question perhaps, but you only need two machines on the same network to test it empirically and get a definite answer.
I think a better approach is to not handle threads yourself. Create a pool (a ThreadPoolExecutor or something similar) and simply dispatch work to your pool.
Of course, I think asynchronous I/O will make it better and faster, but it will only help you with socket and networking problems. When one of your threads blocks on I/O, the JVM puts it to sleep and switches to another thread until the blocking I/O returns. This blocks only that thread; your processor keeps running and starts processing another thread. So, apart from the time it takes to create a thread, the way you use I/O does not affect your model much. If you do not create threads yourself (by using a pool), your problem is solved.
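As an illustration of dispatching to a pool rather than managing threads by hand, here is one possible way to build a bounded pool with the standard ThreadPoolExecutor. The class name BoundedPool, the sizes passed in, and the choice of CallerRunsPolicy are assumptions for the sketch, not a recommendation taken from this thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {
    // Builds a pool with at most maxThreads workers and a bounded backlog.
    static ThreadPoolExecutor create(int maxThreads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,
                60, TimeUnit.SECONDS,                        // idle threads exit after 60 s
                new ArrayBlockingQueue<>(queueCapacity),
                // When both the workers and the queue are full, the submitting
                // thread runs the task itself, which slows down new submissions.
                new ThreadPoolExecutor.CallerRunsPolicy());
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

Usage would be along the lines of BoundedPool.create(200, 1000).execute(task). Keep in mind that if each task is a long-lived blocking connection handler, a bounded pool also caps the number of clients served at the same time.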
Why roll your own? You could use a servlet container with servlets, a message queue, or ZeroMQ.