Error handling design problem on collection of items
I have a collection of items and an operation on them. The operation is part of a remote call between client and server, and it should run on all items at once. On the server side it runs on each item in turn and may fail or succeed for each one. I need to know which items succeeded and which failed. I guess this is a rather common case and there are good solutions to it. How should I design it?
it should run on all items at once
You will hate your life if you don't treat this as a design requirement. All or nothing is the right way to handle it. It will simplify everything you do.
If that isn't an option, just do the dumbest thing possible. Wrap each call in a try/catch and give some report. Chances are no one will be able to consume the report, which is another reason all or nothing is the right thing to do.
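A minimal sketch of that "dumbest thing possible" approach, in Python. The names `BatchReport` and `run_batch` are made up for illustration, and it assumes the items are hashable so they can be used as keys in the failure map:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, Iterable, List


@dataclass
class BatchReport:
    # Items the operation completed on, in processing order.
    succeeded: List[Hashable] = field(default_factory=list)
    # Failed item -> short description of the error.
    failed: Dict[Hashable, str] = field(default_factory=dict)


def run_batch(items: Iterable[Hashable],
              operation: Callable[[Hashable], None]) -> BatchReport:
    """Run operation on every item; one failure does not stop the rest."""
    report = BatchReport()
    for item in items:
        try:
            operation(item)
        except Exception as exc:
            # Record the failure and carry on with the next item.
            report.failed[item] = f"{type(exc).__name__}: {exc}"
        else:
            report.succeeded.append(item)
    return report
```

The server would send the report back as the result of the remote call. Whether the client can do anything useful with it is a separate question.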
edit:
To elaborate: when batching, writing simple logic to report errors is fine, but writing logic to recover from errors is very complicated. I've never seen a system handle recovery well when batching. I'm sure there are some corner cases where each item is completely independent, at which point it doesn't matter that one or another failed, but that is usually not the case.
Generally, I expect any errors that happen during a batching operation to not be critical. By that I mean the system should be able to ignore errors and continue operating as if the message that caused the error never existed.
If it's really vital that these messages get processed, then I would definitely go for all or nothing.
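For all or nothing, a rough in-memory sketch in Python: apply the whole batch to a copy of the state and only hand the copy back if every item succeeded. The function name `apply_all_or_nothing` and the `operation(state, item)` signature are assumptions for illustration. A real server would more likely rely on a database transaction with rollback.

```python
import copy
from typing import Any, Callable, Iterable


def apply_all_or_nothing(state: Any,
                         items: Iterable[Any],
                         operation: Callable[[Any, Any], None]) -> Any:
    """Return a new state with operation applied to every item.

    If any item fails, the exception propagates and the original
    state is left untouched, because all work was done on a copy.
    """
    working = copy.deepcopy(state)
    for item in items:
        operation(working, item)  # any exception aborts the whole batch
    return working
```

The caller replaces its state with the returned value on success, and on failure reports a single error for the whole batch, so there is no per-item recovery logic to write.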