C HTTP server - multithreading model?
I'm currently writing an HTTP server in C so that I'll learn about C, network programming and HTTP. I've implemented most of the simple stuff, but I'm only handling one connection at a time. Currently, I'm thinking about how to efficiently add multitasking to my project. Here are some of the options I thought about:
- Use one thread per connection. Simple but can't handle many connections.
- Use non-blocking API calls only and handle everything in one thread. Sounds interesting, but using select() and such excessively is said to be quite slow.
- Some other multithreading model, e.g. something complex like what lighttpd uses. (Probably) the best solution, but (probably) too difficult to implement.
Any thoughts on this?
There is no single best model for writing multi-tasked network servers. Different platforms have different solutions for high performance (I/O completion ports, epoll, kqueues). Be careful about going for maximum portability: some features are mimicked on other platforms (e.g. select() is available on Windows) and yield very poor performance because they are simply mapped onto some other native model.
Also, there are other models not covered in your list, in particular the classic UNIX "pre-fork" model.
In all cases, use any form of asynchronous I/O when available. If it isn't, look into non-blocking synchronous I/O. Design your HTTP library around asynchronous streaming of data, but keep the I/O bit out of it. This is much harder than it sounds. It usually implies writing state machines for your protocol interpreter.
That last bit is most important because it will allow you to experiment with different representations. It might even allow you to write a compact core for each platform, built on that platform's local high-performance tools, and swap this core from one platform to the other.
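To make the "state machine with the I/O kept out of it" idea concrete, here is a minimal sketch that parses only the HTTP request line. All names (rl_parser, rl_init, rl_feed, the RL_* states) and the buffer sizes are made up for illustration, and a real parser would have to cover headers, bodies and many more error cases. The point is that rl_feed() never touches a socket: you hand it whatever bytes arrived, in chunks of any size, and it remembers where it was.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical states for parsing "METHOD SP TARGET SP VERSION CR LF". */
enum rl_state { RL_METHOD, RL_TARGET, RL_VERSION, RL_CR, RL_DONE, RL_ERROR };

struct rl_parser {
    enum rl_state state;
    size_t len;          /* bytes stored so far in the field being filled */
    char method[16];     /* sizes are arbitrary choices for this sketch */
    char target[256];
    char version[16];
};

void rl_init(struct rl_parser *p) {
    assert(p != NULL);
    memset(p, 0, sizeof *p);   /* zeroing also keeps every field NUL-terminated */
    p->state = RL_METHOD;
}

/* Handles one byte of a space-terminated field (method or target). */
static void rl_field(struct rl_parser *p, char *field, size_t cap,
                     char c, enum rl_state next) {
    if (c == ' ') {
        p->state = p->len > 0 ? next : RL_ERROR;   /* an empty field is invalid */
        p->len = 0;
    } else if (c == '\r' || c == '\n' || p->len + 1 >= cap) {
        p->state = RL_ERROR;                       /* stray line break or field too long */
    } else {
        field[p->len++] = c;
    }
}

/* Feeds n bytes to the parser and returns how many were consumed.
   Performs no I/O; stops early once the line is complete or invalid. */
size_t rl_feed(struct rl_parser *p, const char *buf, size_t n) {
    size_t i = 0;
    while (i < n && p->state != RL_DONE && p->state != RL_ERROR) {
        char c = buf[i++];
        switch (p->state) {
        case RL_METHOD:
            rl_field(p, p->method, sizeof p->method, c, RL_TARGET);
            break;
        case RL_TARGET:
            rl_field(p, p->target, sizeof p->target, c, RL_VERSION);
            break;
        case RL_VERSION:
            if (c == '\r')
                p->state = p->len > 0 ? RL_CR : RL_ERROR;
            else if (c == ' ' || c == '\n' || p->len + 1 >= sizeof p->version)
                p->state = RL_ERROR;
            else
                p->version[p->len++] = c;
            break;
        case RL_CR:
            p->state = (c == '\n') ? RL_DONE : RL_ERROR;
            break;
        default:
            break;
        }
    }
    return i;
}
```

Because the parser only sees byte buffers, the same code can sit behind blocking reads, a select() loop, epoll or completion ports. The return value tells the caller where the rest of the request (the headers) starts in its buffer.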
Yeah, do the one that's interesting to you. When you're done with it, if you're not utterly sick of the project, benchmark it, profile it, and try one of the other techniques. Or, even more interesting, abandon the work, take what you've learned, and move on to something completely different.
You could use an event loop as in node.js:
Source code of node (C, C++, JavaScript)
https://github.com/joyent/node
Ryan Dahl (the creator of node) outlines the reasoning behind the design of node.js, non-blocking I/O and the event loop as an alternative to multithreading in a web server.
http://www.yuiblog.com/blog/2010/05/20/video-dahl/
Douglas Crockford discusses the event loop in Scene 6: Loopage (Friday, August 27, 2010)
http://www.yuiblog.com/blog/2010/08/30/yui-theater-douglas-crockford-crockford-on-javascript-scene-6-loopage-52-min/
An index of Douglas Crockford's above talk (if further background information is needed). Doesn't really apply to your question though.
http://yuiblog.com/crockford/
Look at your platform's most efficient socket polling model: epoll (Linux), kqueue (FreeBSD), WSAEventSelect (Windows). Perhaps combine it with a thread pool and handle N connections per thread. You could always start with select and then replace it with a more efficient model once it works.
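As a rough idea of the "start with select" step, here is a sketch of one iteration of a single-threaded loop on a POSIX system. The function names wait_readable and pump_once are invented for this example, and pump_once just counts bytes where a real server would hand them to its request parser. Note the FD_SETSIZE check: that limit is one of the reasons people move on to epoll or kqueue later.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Waits up to timeout_ms for any descriptor in fds[0..n) to become readable.
   Stores the readable ones in ready[] and returns their count
   (0 on timeout, -1 on error). */
int wait_readable(const int *fds, int n, int *ready, int timeout_ms) {
    fd_set set;
    struct timeval tv;
    int maxfd = -1, count = 0;

    assert(fds != NULL && ready != NULL);
    FD_ZERO(&set);
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;                      /* select() cannot watch this descriptor */
        FD_SET(fds[i], &set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(maxfd + 1, &set, NULL, NULL, &tv);
    if (rc <= 0)
        return rc;
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &set))
            ready[count++] = fds[i];
    return count;
}

/* One loop iteration: wait, then read once from each readable descriptor.
   Returns the total number of bytes read, or -1 on error. */
long pump_once(const int *fds, int n, int timeout_ms) {
    int ready[FD_SETSIZE];
    char buf[4096];
    long total = 0;

    assert(n >= 0 && n <= FD_SETSIZE);
    int count = wait_readable(fds, n, ready, timeout_ms);
    if (count < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        /* select() reported this descriptor readable, so this read() should not block */
        ssize_t got = read(ready[i], buf, sizeof buf);
        if (got < 0)
            return -1;
        total += got;   /* got == 0 means the peer closed; a server would drop the fd here */
    }
    return total;
}
```

In a server, the listening socket goes into the same array: when it shows up as readable, accept() will not block. Swapping in epoll or kqueue later mostly means replacing wait_readable() while the rest of the loop stays the same.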
A simple solution might be having multiple processes: have one process accept connections, and as soon as a connection is established, fork and handle the connection in the child process.
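A minimal sketch of that accept-then-fork pattern on a POSIX system could look like the following. The names listen_loopback, handle_client, accept_and_fork and reap_children are hypothetical, and the handler just sends a canned response where the real request handling would go.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates a TCP socket listening on 127.0.0.1:port (port 0 lets the OS pick one). */
int listen_loopback(unsigned short port) {
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical per-connection handler: sends a fixed response. */
static void handle_client(int fd) {
    static const char reply[] = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok";
    ssize_t sent = write(fd, reply, sizeof reply - 1);
    (void)sent;   /* a real handler would check for short writes and errors */
}

/* Accepts one connection and serves it in a forked child.
   Returns the child's pid in the parent, or -1 on failure. */
pid_t accept_and_fork(int listen_fd) {
    assert(listen_fd >= 0);
    int conn = accept(listen_fd, NULL, NULL);
    if (conn < 0)
        return -1;
    pid_t pid = fork();
    if (pid == 0) {            /* child: owns the connection */
        close(listen_fd);
        handle_client(conn);
        close(conn);
        _exit(0);
    }
    close(conn);               /* parent: drop its copy, also when fork() failed */
    return pid;
}

/* Collects exited children without blocking, so they do not linger as zombies. */
void reap_children(void) {
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        /* nothing to do per child in this sketch */
    }
}
```

The server's main loop would call accept_and_fork() repeatedly and reap_children() now and then. The parent closing its copy of the connection matters: otherwise the client never sees end-of-file after the child is done.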
An interesting variant of this technique is used by the SER/OpenSER/Kamailio SIP proxy: there's one main process that accepts the connections and multiple child worker processes, connected via pipes. The parent sends the new file descriptor through the socket. See this book excerpt at 17.4.2. Passing File Descriptors over UNIX Domain Sockets. The OpenSER/Kamailio SIP proxies are used for heavy-duty SIP processing where performance is a huge issue, and they do very well with this technique (plus shared memory for information sharing). Multi-threading is probably easier to implement, though.
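For reference, passing a descriptor over a UNIX domain socket is done with sendmsg()/recvmsg() and an SCM_RIGHTS control message. The sketch below shows the general mechanism only; it is not taken from the Kamailio sources, and send_fd/recv_fd are names chosen for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Sends fd_to_send over the UNIX domain socket chan. Returns 0 on success, -1 on error. */
int send_fd(int chan, int fd_to_send) {
    char byte = 'F';   /* one byte of ordinary data travels with the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;

    assert(chan >= 0 && fd_to_send >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;            /* payload is an array of descriptors */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receives a descriptor sent by send_fd(). Returns the new descriptor, or -1 on error. */
int recv_fd(int chan) {
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(chan, &msg, 0) <= 0)
        return -1;
    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;   /* a new descriptor in this process, referring to the same open file */
}
```

In the worker model, the parent would create one socketpair(AF_UNIX, SOCK_STREAM, 0, ...) per worker before forking, call send_fd() with each accepted connection, and close its own copy afterwards. The worker blocks in recv_fd() and then serves the connection it receives.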