StackOverflowException in a .Net Windows Service
In some circumstances my .NET Windows service can generate a StackOverflowException. Unfortunately, the behaviour seems to be that the service simply stops dead and doesn't write anything into the event log. I don't even get a message from the service control manager saying the service has failed.
Is there any way at all a Windows service can detect that such an exception has occurred?
In the documentation for this exception, MSDN says "Note that an application that hosts the common language runtime (CLR) can specify that the CLR unload the application domain where the stack overflow exception occurs and let the corresponding process continue". This is the kind of thing I would expect the Windows service implementation to do, but it doesn't.
Please don't just reply saying I should make sure my code never ever throws such an exception - trust me, I would if I could - what I am trying to do is handle the worst case scenario in a sensible way and make my service resilient to unexpected errors.
A stack overflow is about the worst kind of heart attack a thread can suffer. It is so bad that you don't even get something in the event log. It is so bad that you can't even do anything reasonable to recover the state of your program. The thread is dead and so is the state of the appdomain. It got mutated in completely unpredictable ways, you can only throw it away.
Well, you already know all that. But shrugging this off and pretending that it didn't happen causes a different kind of failure: a system failure. The service was supposed to do something and it didn't happen. There are not a lot of scenarios where that's acceptable. A file didn't get processed, a database update didn't happen, et cetera. It is the kind of mishap that can cause a chain of mishaps later on, like the CFO discovering that a million bucks is missing at the end of the year.
You didn't want to hear this but there is no sensible way to handle this. Focus all of your efforts on finding the bug, not the band-aid. And stack overflow is always a programming bug.
Okay, a practical answer. You are not stuck with a fixed stack size. You can use the Thread(ThreadStart, int) constructor to create a thread with a larger stack; the second argument is the maximum stack size in bytes. Give it a couple of dozen megabytes. This should go a long way toward avoiding the problem, if not solve it completely.
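A minimal sketch of the larger-stack approach, assuming the risky work can be wrapped in a delegate. The names `LargeStackRunner`, `Run` and `Depth` are made up for illustration, and the 32 MB figure used below is just one reading of "a couple of dozen megabytes":

```csharp
using System;
using System.Threading;

static class LargeStackRunner
{
    // Hypothetical helper (not from the original service): runs `work` on a
    // dedicated thread whose maximum stack size is `stackSizeBytes`.
    public static void Run(Action work, int stackSizeBytes)
    {
        Exception failure = null;

        // Thread(ThreadStart, int maxStackSize): the second argument is in bytes.
        var worker = new Thread(() =>
        {
            // Only ordinary exceptions can be captured here; a stack overflow
            // on this thread still terminates the whole process.
            try { work(); }
            catch (Exception ex) { failure = ex; }
        }, stackSizeBytes);

        worker.Start();
        worker.Join(); // block until the work has finished

        if (failure != null)
            throw new InvalidOperationException("Worker thread failed.", failure);
    }

    // Deliberately recursive demo method, used only to exercise the stack.
    public static int Depth(int n)
    {
        return n == 0 ? 0 : 1 + Depth(n - 1);
    }
}
```

Something like `LargeStackRunner.Run(() => ProcessFile(path), 32 * 1024 * 1024);` would then run a (hypothetical) `ProcessFile` method with a 32 MB stack. This only raises the ceiling; input that recurses without bound will still overflow.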
The next thing to do is to start screening the .xml file you are given to process. I'm not sure whether it is the raw size of the file or bad data in it that causes the SO. Start by checking the size of the file and drop it in a separate directory if it is a monster, to be processed manually, preferably by whoever created the file in the first place. And make sure that you've got a couple of trouble-maker files, if you don't have them already. Try to process them off-line with a monster thread stack size. If that still blows, start looking for an algorithm that can pre-screen the .xml content to detect the source of the problem.
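The screening step could look roughly like this. It is a sketch under assumptions: `XmlScreener`, the quarantine directory and the size threshold are all hypothetical, and the nesting-depth check only helps if deep nesting turns out to be the culprit, which nothing above establishes.

```csharp
using System;
using System.IO;
using System.Xml;

static class XmlScreener
{
    // Returns true when the file is small enough to process. Otherwise moves
    // it into `quarantineDir` for manual handling and returns false.
    public static bool ScreenBySize(string path, string quarantineDir, long maxBytes)
    {
        if (new FileInfo(path).Length <= maxBytes)
            return true;

        Directory.CreateDirectory(quarantineDir);
        // File.Move throws if a file with the same name is already quarantined.
        File.Move(path, Path.Combine(quarantineDir, Path.GetFileName(path)));
        return false;
    }

    // One possible content pre-screen (an assumption, not a known cause):
    // reports the deepest element nesting level, counting the root as 1.
    // XmlReader streams through the file, so this check uses no recursion.
    public static int MaxElementDepth(string path)
    {
        int max = 0;
        using (XmlReader reader = XmlReader.Create(path))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    max = Math.Max(max, reader.Depth + 1);
            }
        }
        return max;
    }
}
```

Running both checks over the known trouble-maker files should show fairly quickly whether size or nesting depth separates them from files that process cleanly.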
Ask another question if you think that the .xml file content might be the cause and you need to find out what kind of bad content could trigger this (I don't know much of anything about XSLT).