Should I use a simple delay or an exponential backoff?

I have a distributed system that executes processes (not OS processes, just units of work that need to be done). After a few unsuccessful tries (timeouts), it reports a failure.

I want to keep trying to execute the process in the background afterwards, and the question is: should I use a bigger fixed timeout period, or an increasingly bigger one (growing with each try)?

  • There are many reasons for a process to fail, mainly network problems.


It depends on why the first attempt failed.

If it is due to potential overload or temporary exhaustion of some resource, you might want to try an exponential backoff strategy. The reason is that continuous attempts to acquire what you want could make things even worse, and so will probably never lead to success.
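As a rough illustration of the backoff idea, here is a minimal Python sketch. The names `backoff_delay` and `retry_with_backoff`, the default values, and the use of `TimeoutError` as the failure signal are all assumptions made for the example, not something taken from the question. The random jitter is a commonly used addition so that many clients do not retry in lockstep.

```python
import random
import time


def backoff_delay(attempt, base=1.0, factor=2.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based), never above `cap`."""
    return min(cap, base * factor ** attempt)


def retry_with_backoff(task, max_attempts=5, base=1.0, factor=2.0, cap=60.0,
                       jitter=True, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is used up.

    TimeoutError is a stand-in (assumption) for whatever your system
    raises when a process times out.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            delay = backoff_delay(attempt, base, factor, cap)
            if jitter:
                # Pick a random wait in [0, delay] so simultaneous clients
                # spread their retries out instead of hitting the resource together.
                delay = random.uniform(0, delay)
            sleep(delay)
```

The `cap` keeps the wait from growing without bound, and `sleep` is a parameter only so the function can be exercised without really waiting.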

If you are basically waiting for something to happen or become available, e.g. a port being open or a file being there (basically "polling"), you might just want to wait for fixed periods of time.
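For the polling case, a fixed interval could look roughly like the sketch below. The function name `wait_for`, its defaults, and the file path in the usage comment are hypothetical and chosen only for illustration.

```python
import time


def wait_for(condition, interval=2.0, timeout=30.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll condition() every `interval` seconds.

    Returns True as soon as the condition holds, or False once
    `timeout` seconds have passed without success.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Example usage (hypothetical path):
#   import os
#   wait_for(lambda: os.path.exists("/tmp/ready.flag"), interval=1.0, timeout=10.0)
```

The wait between checks stays the same here because the checks themselves are cheap and do not add load to whatever is being waited on.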

This is somewhat oversimplified, but may give some basic ideas. Whatever strategy (or combination of strategies) you choose, test it thoroughly to confirm that it actually works and does not make anything worse.


If there are many reasons why it could fail, it might be worth redesigning the processes so that they can continue after something goes wrong.


I think the first option is the better choice. If the delay gets bigger and bigger on each try (1 -> 2, 2 -> 4, 4 -> 8, 8 -> 16, ...), then starting at 1 minute, the next wait is already over an hour after about an hour of failures, and it passes a full day after roughly a dozen tries.

I will go with the first approach and define a reasonable timeout.
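To make the doubling arithmetic concrete, the following sketch lists each wait and the total time spent waiting so far, in minutes. The function name `doubling_schedule` and the optional `cap_min` parameter are assumptions for the example. A cap is one common way to keep a doubling delay within a reasonable bound.

```python
def doubling_schedule(first_wait_min=1, attempts=12, cap_min=None):
    """Return (wait, elapsed) pairs in minutes for waits that double on each retry.

    If cap_min is given, no single wait exceeds it.
    """
    schedule = []
    wait, elapsed = first_wait_min, 0
    for _ in range(attempts):
        if cap_min is not None:
            wait = min(wait, cap_min)
        elapsed += wait
        schedule.append((wait, elapsed))
        wait *= 2
    return schedule


# Uncapped: the 7th wait is 64 minutes, after 63 minutes of earlier waiting.
# Uncapped: the 12th wait is 2048 minutes, the first one longer than a day.
# With cap_min=60: every wait from the 7th onward is 60 minutes.
```

Starting at 1 minute without a cap, the next wait is always about as long as all the waiting done so far. With a cap, the schedule behaves like a backoff at first and like a fixed delay afterwards.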
