Parallel.ForEach - Forcing interrupt of long-running blocking calls
I'm not happy about the inability to abort threads at all. An example:
You have a simple Windows Forms application that connects to a blocking, synchronous web service and executes a function on that web service within a Parallel loop.
Parallel.ForEach(iListOfItems, po, (item, loopState) =>
{
    ParallelProcessIntervalRequest(wcfProxyClient, item, loopState);
});
The web service call takes 2 minutes to complete on average (it could be any blocking call, such as Thread.Sleep, not just a web service). I set MaxDegreeOfParallelism to a pragmatic 20 threads. iListOfItems has 1000 items to process.
The user clicks the process button and the loop commences. Very nice: we have 20 threads all working through the 1000 items in the iListOfItems collection. Great.
However, the user needs to close the application for some reason, so they close the form. These 20 threads will continue to execute on all 1000 items, which is not good if perhaps only 40 have been processed so far. It is quite bad, because the application will not exit as the user expects but will continue to run behind the scenes, as seen in Task Manager.
Say the user then tries to rebuild the app in VS 2010: it reports that the exe is still locked, and they have to go into Task Manager and kill it.
You're shouting: but of course, you should be cancelling those threads using the new parallel cancellation constructs. But the situation does not get much better: the user would still have to wait until the last blocking call has finished, which in our example is 2 minutes. There are many more scenarios in which this behaviour causes issues.
So I chose not to call the Cancel function of the CancellationTokenSource object, as that raises an exception, which is expensive and is also arguably the anti-pattern of controlling the flow of code with exceptions. Instead I implemented a simple thread-safe property, StopExecuting. Within the loop I check the value of StopExecuting and, if it has been set to true by an external influence, I make the following call from within the loop:
if (loopState.ShouldExitCurrentIteration || loopState.IsExceptional || StopExecuting) {loopState.Stop(); return;}
So the iteration can exit in a 'controlled' manner and stop the loop from processing further iterations, but as I said, this does little for our dilemma.
The long-running blocking call made within the iteration has to complete before we can check whether we should stop. So when the user closes the form, the 20 threads may have been asked to stop, but they will only stop once they have finished executing their long-running function call, which takes 2 minutes on average.
The same is true for calling Cancel on CancellationTokenSource. The only difference is that once the iteration has finished (not been interrupted, as with a thread abort), an exception is cached, ready for when all the other threads eventually finish and return. Calling Cancel on CancellationTokenSource does not appear to throw an exception on the processing thread, which would interrupt the blocking call like a thread abort.
If it did, it would be no different to calling thread abort on the thread anyway, for both ways would result in an exception that could be caught within the thread to deal with closing down and releasing resources etc, before the thread exits.
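The behaviour described above can be reproduced with a minimal sketch. `CancellationSketch`, `BlockingCall` and the timings are made up for illustration (a 200 ms sleep stands in for the 2-minute web service call); only `ParallelOptions.CancellationToken`, `CancellationTokenSource.Cancel` and `OperationCanceledException` are the library's own.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    // Hypothetical stand-in for the blocking web service call.
    static void BlockingCall(int item) { Thread.Sleep(200); }

    // Returns how many iterations completed before the loop stopped.
    public static int Run()
    {
        var cts = new CancellationTokenSource();
        var po = new ParallelOptions
        {
            MaxDegreeOfParallelism = 4,
            CancellationToken = cts.Token
        };
        int completed = 0;

        // Simulate the user closing the form 300 ms into the run.
        new Thread(() => { Thread.Sleep(300); cts.Cancel(); })
        {
            IsBackground = true
        }.Start();

        try
        {
            Parallel.ForEach(Enumerable.Range(0, 1000), po, item =>
            {
                BlockingCall(item); // Cancel() does not interrupt this call
                Interlocked.Increment(ref completed);
            });
        }
        catch (OperationCanceledException)
        {
            // Raised once on the calling thread, after in-flight calls return.
        }
        return completed;
    }
}
```

Cancelling stops new iterations from starting, so only a handful of the 1000 items complete, but every call already in flight still runs to its end before Parallel.ForEach returns.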
Catching the thread abort exception is where you can mitigate the claim of "leaving the system in an unstable/undefined state" should an exceptional circumstance, such as closing the form, occur. To say that this might sometimes not be possible is a matter for the programmer: it is up to them to ensure that it is, through the way they code the loop and the resource handles they choose to maintain. All things considered, the inability to interrupt blocking calls with thread-abort-like behaviour feels like a tool has been lost from our bag of tricks. I would have to revert to normal threading constructs in instances like this to regain this single but important ability. Shame.
So is this a problem, a shortcoming in the new parallel library? If not, how does the library enable us to kill those threads when the form is closed, without waiting on blocking calls or leaving the process running behind the scenes? Of course, such control would be relatively straightforward if we used the 'older' threading primitives and actual threads.
One of the best things about the TPL is its extensibility. If you don't like something in it, chances are there's a way to replace the part you don't like. If you want different queuing semantics, implement a custom TaskFactory; if you want tighter control over the actual threads, their priorities, their apartment state, and so on, implement a custom TaskScheduler.
In your case a custom TaskScheduler would give you access to all the threads being used, and you could kill them off if you wanted to. Can't say I'd recommend that, but it would work.
Sample on my blog or on MSDN.
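The linked samples are not reproduced here. As a rough sketch of the idea only, and not the answerer's actual sample, a scheduler that owns its threads might look like the following. `DedicatedThreadScheduler`, its `Threads` property and `Demo` are hypothetical names. The threads are marked as background threads so they do not keep the process alive. Note that Thread.Abort is not supported on .NET Core and later, so killing the threads outright is only an option on .NET Framework.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler that runs every task on threads it owns.
public sealed class DedicatedThreadScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly List<Thread> _threads = new List<Thread>();

    public DedicatedThreadScheduler(int threadCount)
    {
        for (int i = 0; i < threadCount; i++)
        {
            var thread = new Thread(() =>
            {
                // Each owned thread drains the queue until CompleteAdding is called.
                foreach (Task task in _tasks.GetConsumingEnumerable())
                    TryExecuteTask(task);
            });
            thread.IsBackground = true; // does not keep the process alive
            _threads.Add(thread);
            thread.Start();
        }
    }

    // The threads doing the work, exposed so the caller can inspect them.
    public IList<Thread> Threads { get { return _threads.AsReadOnly(); } }

    public override int MaximumConcurrencyLevel { get { return _threads.Count; } }

    protected override void QueueTask(Task task) { _tasks.Add(task); }

    // Never run work on the caller's thread; always use the owned threads.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks() { return _tasks.ToArray(); }

    public void Dispose() { _tasks.CompleteAdding(); }

    // Counts how many of 8 iterations ran on threads owned by the scheduler.
    public static int Demo()
    {
        using (var scheduler = new DedicatedThreadScheduler(4))
        {
            var po = new ParallelOptions { TaskScheduler = scheduler };
            int onOwnedThreads = 0;
            Parallel.ForEach(Enumerable.Range(0, 8), po, item =>
            {
                if (scheduler.Threads.Contains(Thread.CurrentThread))
                    Interlocked.Increment(ref onOwnedThreads);
            });
            return onOwnedThreads;
        }
    }
}
```

Because inlining is refused, the thread that calls Parallel.ForEach only waits. All iterations run on the scheduler's own threads, which is what gives you a handle on them.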
No, it's not a shortcoming of the TPL.
The elephant in the room is those two-minute webservice calls; they shouldn't take that long.
All the TPL can do is call code. If that code blocks, the TPL has no way of signalling it to stop, so it has to either wait or abort the thread. I suppose it could kill threads and things, but that's dangerous and would cause far more outrage than not being able to, as people would use it incorrectly (remember Thread.IsBackground?).
Idea 1
My personal choice would be to make a webservice call that simply puts a message on a queue (or use an ESB like NServiceBus) and returns immediately. That message would then be processed by one or more instances of a separate service (for scalability). That service can take as long as it likes to process the message, and you are moving the parallelism from the client to the server: not needing multiple threads on the client means a simpler client.
You can then request the state later on, either by polling or by sending messages back. In the meantime, you may choose to mark some sort of local state as "pending" to give feedback to the user.
Idea 2
If the webservice call is out of your control and takes 2 minutes, then you're in trouble. You could write your own threading code and handle killing threads when the app closes, but that's a bad idea. Alternatively, you could have a local Windows service as part of your client application that runs all the time, processing messages from MSMQ queues on the local machine. That service can then call the long-running webservice methods, and you still have a client app that is responsive and closes quickly.
Idea 3
Have a proxy webservice that you do control (i.e. write yourself) that basically does what I described in idea 1 above, but where the act of "processing the message" is actually calling the long-running webservice.
IMO, it's not a problem of the parallel library. With every threading construct you have the same fundamental problem: if you want a neat way to abort, you'll have to implement it yourself. If you were to use a "regular" Thread and a signalling system to bail out, it would have the same problem as the parallel library: if it had just entered a blocking call lasting two minutes, it would take two minutes before the bail-out occurred.
Since you say you don't have control over the services you call, I think the only option is to see how you can redesign your own code to take into account the possibility of a slow call.
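One possible redesign, offered only as a sketch and not something this answer spells out, is to stop waiting for the slow call rather than trying to interrupt it. `AbandonSketch`, `CallOrAbandon` and `Demo` are hypothetical names. The sketch assumes the call is safe to leave running in the background, and that the default scheduler runs LongRunning tasks on background threads, which matches current implementations as far as I know.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AbandonSketch
{
    // Runs blockingCall on its own thread and waits for it, but stops
    // waiting as soon as the token is cancelled. Returns true if the call
    // completed, false if it was abandoned (the call itself keeps running).
    public static bool CallOrAbandon(Action blockingCall, CancellationToken token)
    {
        Task call = Task.Factory.StartNew(
            blockingCall, CancellationToken.None,
            TaskCreationOptions.LongRunning, TaskScheduler.Default);

        // Observe any later fault so an abandoned call does not leave an
        // unobserved exception behind.
        call.ContinueWith(t => { var ignored = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);

        try
        {
            call.Wait(token); // cancelling the token ends the wait, not the call
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }

    // A 2-second "web service call" abandoned after about 100 ms.
    public static bool Demo()
    {
        var cts = new CancellationTokenSource();
        new Thread(() => { Thread.Sleep(100); cts.Cancel(); })
        {
            IsBackground = true
        }.Start();
        return CallOrAbandon(() => Thread.Sleep(2000), cts.Token);
    }
}
```

Inside the Parallel.ForEach body, a false return would be the cue to call loopState.Stop() and return. The trade-off is that the web service still receives and processes the abandoned request; the client simply no longer waits for the result.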
As an aside: I'm by no means an expert on the parallel library, but aren't the threads spawned by that library implemented as background threads, i.e. shouldn't they be torn down automatically on application shutdown?
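On that aside, a small check (with the hypothetical name `BackgroundThreadSketch`) shows that thread pool threads are indeed background threads, while a manually created Thread is a foreground thread by default. What it does not show is that Parallel.ForEach blocks its caller until the loop has finished, and also runs iterations on the calling thread. If the loop is started from a foreground thread such as the UI thread, that thread alone may be enough to keep the process alive.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundThreadSketch
{
    // True when the default scheduler's pool thread is a background thread.
    public static bool PoolThreadIsBackground()
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsBackground,
            CancellationToken.None, TaskCreationOptions.None,
            TaskScheduler.Default).Result;
    }

    // A thread you create yourself is a foreground thread unless you
    // set IsBackground = true before starting it.
    public static bool NewThreadIsBackgroundByDefault()
    {
        return new Thread(() => { }).IsBackground;
    }
}
```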