Can Parallel.ForEach be used safely with CloudTableQuery
I have a reasonable number of records in an Azure Table that I'm attempting to do some one time data encryption on. I thought that I could speed things up by using a Parallel.ForEach
. Also because there are more than 1K records and I don't want to mess around with continuation tokens myself I'm using a CloudTableQuery to get my enumerator.
My problem is that some of my records have been double encrypted and I realised that I'm not sure how thread safe the enumerator returned by CloudTableQu开发者_运维知识库ery.Execute()
is. Has anyone else out there had any experience with this combination?
I would be willing to bet the answer to Execute returning a thread-safe IEnumerator
implementation is highly unlikely. That said, this sounds like yet another case for the producer-consumer pattern.
In your specific scenario I would have the original thread that called Execute read the results off sequentially and stuff them into a BlockingCollection<T>
. Before you start doing that though, you want to start a separate Task
that will control the consumption of those items using Parallel::ForEach
. Now, you will probably also want to look into using the GetConsumingPartitioner
method of the ParallelExtensions library in order to be most efficient since the default partitioner will create more overhead than you want in this case. You can read more about this from this blog post.
An added bonus of using BlockingCollection<T>
over a raw ConcurrentQueueu<T>
is that it offers the ability to set bounds which can help block the producer from adding more items to the collection than the consumers can keep up with. You will of course need to do some performance testing to find the sweet spot for your application.
Despite my best efforts I've been unable to replicate my original problem. My conclusion is therefore that it is perfectly OK to use Parallel.ForEach
loops with CloudTableQuery.Execute()
.
精彩评论