Azure Table Storage Data Consistency
Let's say I have a table in Azure Table Storage:
public class MyTable
{
    public string PK { get; set; }
    public string RowPK { get; set; }
    public double Amount { get; set; }
}
And a message in an Azure Queue that says "add 10 to Amount".
Now let's say one worker role:
- Takes this message from queue
- Takes row from table
- Amount += 10
- Updates Row in Table
- Fails before it can remove the message from the queue
After a while the message becomes available in the queue again, so the next worker role:
- Takes this message from queue
- Takes row from table
- Amount += 10
- Updates Row in Table
- Removes message from queue
Those actions result in Amount += 20 instead of Amount += 10.
How can I avoid such situations?
I would suggest that you implement a form of optimistic concurrency. The message you send to update the row should contain both the "previous value" and the "new value" of the Amount property.
The second worker role that tries to update the row will then first check that the current value is still equal to the "previous value". If it is not, the worker role knows something went wrong and can, for example, discard the message without doing the update, and perhaps also write an error to a log.
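The previous-value check can be sketched with a small in-memory model. This is only an illustration: `Row`, `UpdateMessage` and `apply_update` are hypothetical names, and no real storage client is involved. Against real storage the comparison and the write would have to be a single conditional operation (for example an ETag-based conditional update), which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical stand-in for the table entity."""
    amount: float


@dataclass(frozen=True)
class UpdateMessage:
    """Hypothetical queue message carrying both values."""
    previous_amount: float
    new_amount: float


def apply_update(row: Row, msg: UpdateMessage) -> bool:
    """Apply msg only if the row still holds the expected previous value."""
    if row.amount != msg.previous_amount:
        # The update was already applied (or something else changed the row).
        return False
    row.amount = msg.new_amount
    return True


row = Row(amount=100.0)
msg = UpdateMessage(previous_amount=100.0, new_amount=110.0)

first = apply_update(row, msg)   # first worker: applies, then crashes
second = apply_update(row, msg)  # redelivery: detected and skipped
```

After the redelivery the amount has been increased only once; the second call returns `False`, which is the point where a worker would discard the message and log the event.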
All the messages that you put on the queue must be idempotent. There is always a chance that a worker role won't finish its job, so processing a message must be repeatable.
So instead of making the task amount += 10, make it something like amount = 300. Get the current amount in the web role, add 10 to it, and place the new amount on the queue.
I'm not sure this is the correct way. If you do it like this, there will be a problem when two web roles try to add 10 at the same moment.
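Both points can be seen in a small in-memory sketch (the dictionary standing in for the table and the `set_amount` helper are assumptions for illustration only). Replaying an absolute "set" message is harmless, but two producers that read the same starting value overwrite each other.

```python
def set_amount(table: dict, key: str, new_amount: float) -> None:
    """Idempotent task: store an absolute value instead of adding a delta."""
    table[key] = new_amount


# Case 1: one producer, message delivered twice.
table = {"row1": 100.0}
message = ("row1", table["row1"] + 10)  # producer computes the new amount
for key, new_amount in [message, message]:  # redelivery after a crash
    set_amount(table, key, new_amount)
after_redelivery = table["row1"]

# Case 2: two producers read the same value before either message is processed.
table = {"row1": 100.0}
seen_by_a = table["row1"]
seen_by_b = table["row1"]
queue = [("row1", seen_by_a + 10), ("row1", seen_by_b + 10)]
for key, new_amount in queue:
    set_amount(table, key, new_amount)
after_race = table["row1"]  # one of the two additions is lost
```

In the first case the amount ends up increased by 10 once, as intended. In the second case two separate "add 10" requests also leave the amount only 10 higher, which is the lost update described in the comment above.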
Did you implement this, or are the lines of code up there just a few thoughts?
"Amount" implies that you are thinking of some kind of bank transaction scenario. It would probably be better to work directly with SQL Azure, since you get ACID guarantees there: http://blogs.msdn.com/ssds/archive/2009/03/12/9471765.aspx ("We have always supported full ACID capabilities in the service and will continue to do so.")
AFAIK, "tables" in Windows Azure are something like Google's Bigtable, aren't they?
- Give each message in the queue a unique MessageId.
- The worker role reads the message from the queue and reads the entity from the table.
- It updates the Amount field.
- It submits a batch operation that writes 2 rows to the table. The first is the updated entity, merged back into the table; the second is a new entity inserted with the same partition key and with the MessageId as its row key. The batch operation is executed atomically.
- Now, in your example, if another worker role tries to process the same message a second time, the batch will fail because an entity with that MessageId already exists in the table. The worker role should then catch that status code from the storage exception and remove the message from the queue.
This is fully idempotent, and you can scale out worker roles as much as you want. Additionally, you do not rely on message order in the queue, which does not guarantee FIFO.
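The steps above can be sketched with an in-memory table whose batch is all-or-nothing. Everything here is hypothetical scaffolding (`InMemoryTable`, `EntityExists`, `process`, the `"balance"` row key and the message fields); it imitates the behaviour the answer relies on rather than calling a real storage SDK, and it does not model two workers updating the same entity concurrently.

```python
class EntityExists(Exception):
    """Raised when a batch tries to insert a row key that already exists."""


class InMemoryTable:
    def __init__(self):
        self.rows = {}  # (partition_key, row_key) -> value

    def submit_batch(self, upserts: dict, inserts: dict) -> None:
        """All-or-nothing: validate every insert before writing anything."""
        for key in inserts:
            if key in self.rows:
                raise EntityExists(key)
        self.rows.update(upserts)
        self.rows.update(inserts)


def process(table, queue, message, crash_before_delete=False):
    pk, message_id, delta = message["pk"], message["id"], message["delta"]
    current = table.rows[(pk, "balance")]
    try:
        table.submit_batch(
            upserts={(pk, "balance"): current + delta},
            inserts={(pk, message_id): "processed"},  # marker row
        )
    except EntityExists:
        pass  # message was already applied; fall through and delete it
    if crash_before_delete:
        return  # simulates the worker dying before it removes the message
    queue.remove(message)


table = InMemoryTable()
table.rows[("acct1", "balance")] = 100.0
msg = {"pk": "acct1", "id": "msg-001", "delta": 10.0}
queue = [msg]

process(table, queue, msg, crash_before_delete=True)  # first worker fails
process(table, queue, msg)                            # message is redelivered
final_amount = table.rows[("acct1", "balance")]
```

The second attempt finds the marker row, so its batch is rejected as a whole and only the message removal takes place; the amount is increased once.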