Asynchronous pages in the ASP.NET framework - where are the other threads and how is it reattached?
Sorry for this dumb question on Asynchronous operations. This is how I understand it.
IIS has a limited set of worker threads waiting for requests. If one request is a long running operation, it will block that thread. This leads to fewer threads to serve requests.
Way to fix this - use asynchronous pages. When a request comes in, the main worker开发者_JAVA百科 thread is freed and this other thread is created in some other place. The main thread is thus able to serve other requests. When the request completes on this other thread, another thread is picked from the main thread pool and the response is sent back to the client.
1) Where are these other threads located? Is there another thread pool?
2) IF ASP.NET likes creating new threads in this other thread pool(?), why not increase the number of threads in the main worker pool - they are all running on the same machine anyway? I don't see the advantage of moving that request to this other thread pool. Memory/CPU should be the same right?
3) If the main thread hands off a request to this other thread, why does the request not get disconnected? It magically hands off the request to another worker thread somewhere else and when the long running process completes, it picks a thread from the main worker pool and sends response to the client. I am amazed...but how does that work?
You didn't say which version of IIS or ASP.NET you're using. A lot of folks talk about IIS and ASP.NET as if they are one and the same, but they really are two components working together. Note that IIS 6 and 7 listen to an I/O completion port where they pick up completions from HTTP.sys. The IIS thread pool is used for this, and it has a maximum thread count of 256. This thread pool is designed in such a way that it does not handle long running tasks well. The recommendation from the IIS team is to switch to another thread if you're going to do substantial work, such as done by the ASP.NET ISAPI and/or ASP.NET "integrated mode" handler on IIS 7. Otherwise you will tie up IIS threads and prevent IIS from picking up completions from HTTP.sys Chances are you don't care about any of this, because you're not writing native code, that is, you're not writing an ISAPI or native handler for the IIS 7 pipeline. You're probably just using ASP.NET, in which case you're more interested in its thread pool and how it works.
There is a blog post at http://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx that explains how ASP.NET uses threads. Note that for ASP.NET v2.0 and v3.5 on IIS 7 you should increase MaxConcurrentRequestsPerCPU to 5000--it is a bug that it was set to 12 by default on those platforms. The new default for MaxConcurrentRequestsPerCPU in ASP.NET v4.0 on IIS 7 is 5000.
To answer your three questions:
1) First, a little primer. Only 1 thread per CPU can execute at a time. When you have more than this, you pay a penalty--a context switch is necessary every time the CPU switches to another thread, and these are expensive. However, if a thread is blocked waiting on work...then it makes sense to switch to another thread, one that can execute now.
So if I have a thread that is doing a lot of computational work and using the CPU heavily, and this takes a long time, should I switch to another thread? No! The current thread is efficiently using the CPU, so switching will only incur the cost of a context switch.
So if I have a thread that makes an HTTP or SOAP request to another server and takes a long time, should I switch threads? Yes! You can perform the HTTP or SOAP request asynchronously, so that once the "send" has occurred, you can unwind the current thread and not use any threads until there is an I/O completion for the "receive". Between the "send" and the "receive", the remote server is busy, so locally you don't need to be blocking on a thread, but instead make use of the async APIs provided in .NET Framework so that you can unwind and be notified upon completion.
Ok, so you're #1 questions was "Where are these other threads located? Is there another thread pool?" This depends. Most code that runs in .NET Framework uses the CLR ThreadPool, which consists of two types of threads, worker threads and i/o completion threads. What about code that doesn't use CLR ThreadPool? Well, it can create its own threads, use its own thread pool, or whatever it wants because it has access to the Win32 APIs provided by the operating system. Based on what we discussed a bit ago, it really doesn't matter where the thread comes from, and a thread is a thread as far as the operating system and hardware is concerned.
2) In your second question, you state, "I don't see the advantage of moving that request to this other thread pool." You're correct in thinking that there is NO advantage to switching unless you're going to make up for that costly context switch you just performed in order to switch. That's why I gave an example of a slow HTTP or SOAP request to a remote server as an example of a good reason to switch. And by the way, ASP.NET does not create any threads. It uses the CLR ThreadPool, and the threads in that pool are entirely managed by the CLR. They do a pretty good job of determining when you need more threads. For example, that's why ASP.NET can easily scale from executing 1 request concurrently to executing 300 requests concurrently, without doing anything. The incoming requests are posted to the CLR ThreadPool via a call to QueueUserWorkItem, and the CLR decides when to call the WaitCallback (see MSDN).
3) The third question is, "If the main thread hands off a request to this other thread, why does the request not get disconnected?" Well, IIS picks up the I/O completion from HTTP.sys when the request initially arrives at the server. IIS then invokes ASP.NET's handler (or ISAPI). ASP.NET immediately queues the request to the CLR Threadpool, and returns a pending status to IIS. This pending status tells IIS that we're not done yet, but as soon as we are done we'll let you know. Now ASP.NET manages the life of that request. When a CLR ThreadPool thread invokes the ASP.NET WaitCallback (see MSDN), it can execute the entire request on that thread, which is the normal case. Or it can switch to one or more other threads if the request is what we call asynchronous--i.e. it has an asynchronous module or handler. Either way, there are well defined ways in which the request completes, and when it finally does, ASP.NET will tell IIS we're done, and IIS will send the final bytes to the client and close the connection if Keep-Alive is not being used.
Regards, Thomas
Async pages in ASP.NET use asynchronous callbacks, and asynchronous callbacks use the Thread Pool, and it is the same thread pool used to serve ASP.NET requests.
However, it's not quite that simple. The .NET ThreadPool
has two types of threads - worker threads and I/O threads. I/O threads use what's called an I/O Completion Port, which is (greatly oversimplifying here) a thread-free or thread-agnostic means of waiting for a read/write operation on a file handle to complete, subsequently running a callback method.
(Note that a file handle does not necessarily refer to a file on disk; as far as Windows is concerned, it could just as well be a socket, pipe, etc.)
A typical .NET web developer doesn't really need to know about any of this. Of course, if you were writing an actual web server, or any kind of network server, then you would definitely need to learn about these, because they are the only way to handle hundreds of incoming connections without actually spawning hundreds of threads to serve them. There's a Managed I/O Completion Port tutorial (CodeProject) if you're interested.
Anyway, getting back on topic; when you interact with the thread pool at a high level, i.e. by writing:
ThreadPool.QueueUserWorkItem(s => DoSomeWork(s));
This does not use an I/O completion port. Ever. It posts the work to one of the normal worker threads managed by thread pool. It's the same if you use async callbacks:
Func<int> asyncFunc;
IAsyncResult BeginOperation(object sender, EventArgs e, AsyncCallback cb,
object state)
{
asyncFunc = () => { Thread.Sleep(500); return 42; };
return asyncFunc.BeginInvoke(cb, state);
}
void EndOperation(IAsyncResult ar)
{
int result = asyncFunc.EndInvoke(ar);
Console.WriteLine(result);
}
Again - same deal. Inside the EndOperation
you're running on a ThreadPool
worker thread. You can verify this by inserting the following debugging code:
void EndSimpleWait(IAsyncResult ar)
{
int maxWorkers, maxIO, availableWorkers, availableIO;
ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
int result = asyncFunc.EndInvoke(ar);
}
Slap a breakpoint in there and you'll see that availableWorkers
is one less than maxWorkers
, while maxIO
and availableIO
are the same.
But some async operations are "special" in .NET. This actually has nothing to do with ASP.NET directly - they'll use I/O completion ports in a Winforms or WPF app too. Examples are:
System.Net.Sockets.Socket
(BeginReceive) and a whole bunch of otherBeginXYZ
methods)System.IO.FileStream
(BeginRead and BeginWrite)System.ServiceModel.ClientBase<T>
(BeginInvoke)System.Net.WebRequest
(BeginGetResponse)
And so on, this is nowhere near a full list. Basically almost every class in the .NET Framework that exposes its own BeginXYZ
and EndXYZ
methods and could conceivably perform any I/O, probably uses I/O completion ports. That's to make it easier for you, the application developer, because I/O threads are kind of hard to implement yourself in .NET.
My guess is that the .NET Framework designers deliberately chose to make it difficult to post I/O operations (compared to worker threads, where you can just write ThreadPool.QueueUserWorkItem
) because it's comparatively "dangerous" if you don't know how to use them properly; by contrast, it's actually pretty straightforward to spawn these in the Windows API.
As before, you can verify what's happening with some debugging code:
WebRequest request;
IAsyncResult BeginDownload(object sender, EventArgs e,
AsyncCallback cb, object state)
{
request = WebRequest.Create("http://www.example.com");
return request.BeginGetResponse(cb, state);
}
void EndDownload(IAsyncResult ar)
{
int maxWorkers, maxIO, availableWorkers, availableIO;
ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
string html;
using (WebResponse response = request.EndGetResponse(ar))
{
using (StreamReader reader = new
StreamReader(response.GetResponseStream()))
{
html = reader.ReadToEnd();
}
}
}
If you step through this one, you'll see that the thread stats are different. The availableWorkers
will match maxWorkers
, but availableIO
is one less than maxIO
. That's because you're running on an I/O thread. That's also why you're not supposed to do any expensive computations in async callbacks - posting CPU-intensive work on an I/O completion port is inefficient and, well, bad.
All of this explains why it's strongly recommended that you use Async pages in ASP.NET when you need to perform any I/O operations. The pattern is only useful for I/O operations; non-I/O async operations will end up being posted to worker threads in the ThreadPool
and you'll still end up blocking subsequent ASP.NET requests. But you can spawn a virtually unlimited number of async I/O operations and not give it a second thought; these won't use any threads at all until the I/O is complete and the callback is ready to begin.
So, to summarize - there is only one ThreadPool
, but there are different kinds of threads in it, and if you're performing slow I/O operations then it's much more efficient to use the I/O threads. It's got nothing to do with CPU or memory, it's all about I/O and file handles.
As for #3, it's not really a question of "why doesn't the request get disconnected", more like a question of "why would it?" A socket doesn't get closed simply because there's no thread currently sending to or receiving data from it, same way your front door doesn't automatically close if there's nobody there to greet guests. Client operations may time out if the server doesn't answer them, and may subsequently choose to disconnect from their end, but that's another issue altogether.
1) The threads are in w3svc or whatever process is running the ASP.NET engine in your particular version of IIS.
2) Not sure what you mean here. You actually have control over how many threads are in the worker thread pool. This article is pretty good: http://msdn.microsoft.com/en-us/library/ms998549.aspx
3) I think you are confusing Requests and connections... To be honest, I haven't a clue how the internals of IIS works, but generally in applications that handle multiple requests simultaneously there is ONE master listening thread that will then hand off the actual work to a child thread (and do nothing else). The original request is not "disconnected" because these things are happening at completely different levels of the network protocol stack. Windows Server has no problem accepting multiple connections on TCP port 80. Think about how TCP/IP works and the fact that it is sending multiple discrete packets of information. You are thinking of "connection" like a single hose going from spigot A to spigot B, but of course that's not how it really works. It is more akin to a bucket that is just collecting whatever gets spilled into it.
Hope this helps.
The answer also depends on which version of IIS you're talking about. In earlier versions, ASP.NET did not use "IIS threads". They were .NET ThreadPool threads. In IIS 7, the IIS and ASP.NET pipelines have been merged. I don't know which threads ASP.NET uses now.
The bottom line is, don't spawn your own threads.
精彩评论