开发者

Why does the task name contain "now / 30"?

In the video/PDF from "Data pipelines with Google App Engine" Brett puts "now / 30" i开发者_StackOverflownto the task name noting that he will explain the reason later, but somehow he never does. :)

http://www.youtube.com/watch?v=zSDC_TU7rtc#t=41m35

task_name = '%s-%d-%d' % (sum_name, int(now / 30), index)

Do you have any idea about the reason? Does it have anything to do with the 7 day period in which one can't re-use task names?

Link to the session page


Brett Slatkin's own explanation

[Brett]
Hey all,

The int(time.time()/30) part of the task name is to prevent queue stalls. When memcache gets evicted the work index counter will be reset to zero. That means new fork-join work items may insert tasks that are named the same as tasks that were already inserted. By including a time window of ~30 seconds in the task name, we ensure that this problem can only last for about thirty seconds. This is also why you should raise an exception when you see a TombstonedTaskError exception.

Worst-case scenario if the clocks are wonky is that two tasks are run to do the fan-in work instead of just one, which is an acceptable trade-off in many cases and a fundamental possibility when using the task queue API. This can be mitigated using pigeon-hole acknowledgment entities, like I use in my materialized view example.

Hope that helps,
[/Brett]

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜