开发者

how node.js server is better than thread based server

Node.js server is works on event based models where callback functions are supported. But I am not able to understand how is it better than traditional thread based servers where threads wait for system IO. In case of thread based model, when a thread needs to wait for IO, it gets preempted so doesn't consume CPU cycles hence doesn't contribute to wait time.

How Node.js improves wait 开发者_StackOverflowtime?


when a thread needs to wait for IO, it gets preempted

Actually, it's not preempted. Preemption is something completely different. What happens is that the thread is blocked.

For an event based model something similar happens. Event based interpreters are basically state machines. Only, the state machine is abstracted away and is not visible to the user. When something is waiting for an event it passes the control back to the interpreter. When the interpreter has nothing else to process it blocks itself waiting for I/O. Only, unlike traditional threading code the interpreter waits for multiple I/O.

What's happening at the C level is that the interpreter is using something like select(), poll(), epoll() and friends (depends on the OS and library installed) to do the blocking and waiting for I/O.

Now, why does a select()/poll() based mechanism generally perform better? Actually, 'generally' here depends on what you mean. A select() based server executes all code in a single process/thread. The biggest performance gain from this is that it avoids context switching - every time the OS transfers control over from one thread to another it has to save all the relevant registers, memory map, stack pointers, FPU context etc. so that the other thread can resume execution where it left off. The overhead of doing this can be quite significant.

In fact, there is a historical example of how extreme the overhead can be. Back in the early 2000s someone started benchmarking web servers. To the surprise of everyone, tclhttpd outperformed Apache for serving static files. Now, tcl is not only an interpreted language, but back in 2000 it was a very slow interpreted language because it didn't have a seperate compilation phase (it sort of does now). Tcl scripts are interpreted directly in string form making it around 400x slower than C. Apache is obviously written in C so what's making tclhttpd faster?

It turned out that tclhttpd is event based running only on a single thread while Apache was multithreaded. The overhead of constant thread switching turned out to give tclhttpd enough advantage to perform better than Apache.

Of course, there is always a compromise. A single threaded server like tclhttpd or node.js cannot take advantage of multiple CPUs. Back in the early 2000s multiple CPUs were uncommon. These days they are almost default. Not to mention that most CPUs are also hyperthreaded (hyperthreading adds hardware to the CPU to make context switching cheap).

The best servers these days have learned from history and are a combination of both. Apache2, and Nginx use therad pools: they are multithreaded but each thread serves more than a single connection. This is a hybrid of the two approaches but is more complex to manage.

Read the following article for a more in-depth discussion on this topic: The C10K problem


Threads are relatively heavy-weight objects that have a resource footprint extending all the way into the kernel. When you park a thread in a blocking syscall or on a mutex or condition variable, you are tying up all those resources but doing nothing. Now the OS has to find more resources so your program can create another thread... Then you idle them too. It doesn't take long before the OS is struggling to scavenge more resources for your program to waste.

CPU time is just one small part of he bigger picture. :-)


Simply put:

In a threaded server, no matter how many threads you have, you can always have that many threads waiting for IO.

In node, no matter how many IO operations are pending, you always have your event loop ready to do the next thing.


When having a lot of threads you are going to have a lot of context switching which is going to be expensive. You want have this overhead when using node.js's Event loop

Context Switch

A context switch is the computing process of storing and restoring state (context) of a CPU so that execution can be resumed from the same point at a later time.

Event loop

In computer science, the event loop, message dispatcher, message loop or message pump is a programming construct that waits for and dispatches events or messages in a program.


I think you are full of myths regarding to threads and cost of context switching.

Discover yourself the truth.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜