Parallel processing of an IO-intensive function
I have this sample code.
List<Dictionary<string, string>> objects = new List<Dictionary<string, string>>();
foreach (string url in urls)
{
    objects.Add(processUrl(url));
}
I need to process each URL. processUrl downloads the page, runs many regexes to extract some information, and returns a "C# JSON-like" object. I want to run this in parallel, and at the end I need the list of objects, so I have to wait for all the tasks before continuing. How can I accomplish this? I've seen many examples, but none that save the return value.
Regards
Like this?
var results = urls.AsParallel().Select(processUrl).ToList();
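One caveat worth knowing: plain AsParallel does not guarantee that the output list lines up with the input order. A minimal sketch, using a stand-in processUrl (the name and return type are assumed from the question, not the asker's real code), shows how AsOrdered restores input order:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    // Stand-in for the question's processUrl; a real version would
    // download the page and run the regexes.
    public static Dictionary<string, string> processUrl(string url)
    {
        return new Dictionary<string, string> { { "url", url } };
    }

    static void Main()
    {
        var urls = new[] { "http://a", "http://b", "http://c" };

        // AsOrdered makes results[i] correspond to urls[i],
        // which AsParallel alone does not promise.
        var results = urls.AsParallel().AsOrdered().Select(processUrl).ToList();

        foreach (var r in results)
            Console.WriteLine(r["url"]);
    }
}
```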
With Parallel:
Parallel.ForEach(
    urls,
    url =>
    {
        var result = processUrl(url);
        lock (syncObject)
            objects.Add(result);
    });
or
var objects = new ConcurrentBag<Dictionary<string,string>>();
Parallel.ForEach(urls, url => objects.Add(processUrl(url)));
var result = objects.ToList();
or with Tasks:
var tasks = urls
.Select(url => Task.Factory.StartNew(() => processUrl(url)))
.ToArray();
Task.WaitAll(tasks);
var results = tasks.Select(arg => arg.Result).ToList();
First, refactor as
processUrl(url, objects);
and make the task responsible for adding the results to the list.
Then add locking so two parallel tasks don't try to use the results list at exactly the same time.
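A minimal sketch of that refactor, assuming the names from the question (the body of processUrl here is a placeholder, and syncObject is an assumed lock object):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Crawler
{
    static readonly object syncObject = new object();
    static readonly List<Dictionary<string, string>> objects =
        new List<Dictionary<string, string>>();

    // The task itself adds its result to the shared list, as suggested
    // above; the dictionary built here is a placeholder result.
    public static void processUrl(string url, List<Dictionary<string, string>> results)
    {
        var result = new Dictionary<string, string> { { "url", url } };
        lock (syncObject)   // two tasks must not touch the list at the same time
        {
            results.Add(result);
        }
    }

    static void Main()
    {
        var urls = new[] { "http://a", "http://b", "http://c" };
        Parallel.ForEach(urls, url => processUrl(url, objects));
        Console.WriteLine(objects.Count);   // all results collected before this line runs
    }
}
```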
Note: async
support in the next version of .NET will make this trivially easy.
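For reference, with async/await (which shipped in .NET 4.5) the pattern looks roughly like this; ProcessUrlAsync is a hypothetical async stand-in for the asker's processUrl, not an existing method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class AsyncDemo
{
    // Hypothetical async stand-in; a real version would await an HTTP
    // download before running the regexes.
    public static async Task<Dictionary<string, string>> ProcessUrlAsync(string url)
    {
        await Task.Yield();  // placeholder for the real I/O
        return new Dictionary<string, string> { { "url", url } };
    }

    static async Task Main()
    {
        var urls = new[] { "http://a", "http://b", "http://c" };

        // Start all downloads, then await them all; Task.WhenAll returns
        // the results in the same order as the input sequence.
        var results = await Task.WhenAll(urls.Select(ProcessUrlAsync));
        Console.WriteLine(results.Length);
    }
}
```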
You can use the .NET 4.0 parallel extensions:
System.Threading.Tasks.Parallel.ForEach(urls, url =>
{
    var result = processUrl(url);
    lock (objects)
    {
        objects.Add(result);
    }
});