PLINQ on ConcurrentQueue isn't multithreading
I have the following PLINQ statement in a C# program:
foreach (ArrestRecord arrest in
    from row in arrestQueue.AsParallel()
    select row)
{
    Geocoder geocodeThis = new Geocoder(arrest);
    writeQueue.Enqueue(geocodeThis.Geocode());
    Console.Out.WriteLine("Enqueued " + ++k);
}
Both arrestQueue and writeQueue are ConcurrentQueues.
Nothing is running in parallel:
- While running, total CPU usage is about 30%, and this is with everything else running, too. I have 8 cores (Hyper-Threading on a Core i7 720QM with 4 physical cores), and 4 of the 8 cores have virtually no utilization at all. The rest run roughly 40%-50%.
- Disk usage is usually 0%, and there's no network usage except for queries to a Postgres DB on localhost (see below).
- If I add a breakpoint somewhere inside geocodeThis.Geocode(), Visual Studio's Thread dropdown just says [pid] Main Thread. It never goes to any other thread.
- I am using Npgsql to connect to Postgres, and each thread runs a few SELECT queries against a table. I am running pgAdmin III's Server Status app, which shows pg_stat_activity. Through monitoring this, and strategic breakpoint placement (see above), I can see that the app never has more than 1 database connection open for all supposedly concurrent threads running geocodeThis.Geocode(). Even if I add Pooling=false to the DB connection string, to force connections not to be pooled, I never see more than 1 connection used in geocodeThis.Geocode().
- The Postgres table is indexed on every column in the WHERE clause. Even if it was poorly indexed, I'd expect lots of disk usage. If Postgres was holding things up in any other way, seems like it would soak a core.
This seems like a simple PLINQ case study, and I am scratching my head as to why nothing's running in parallel.
You are parallelizing just the enumeration of the arrestQueue itself and then "unparallelizing" it back into an ordinary IEnumerable. This all happens before the foreach loop even starts. Then you use the ordinary IEnumerable with the foreach, which runs the body of the loop serially on the calling thread.
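To see this for yourself, here is a minimal sketch that records which threads execute the loop body. The queue of ints is a hypothetical stand-in for arrestQueue, and the variable names are made up for illustration; only the shape of the query matches the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for arrestQueue; the element type is an assumption.
var queue = new ConcurrentQueue<int>(Enumerable.Range(0, 1000));

int callerThreadId = Environment.CurrentManagedThreadId;
var bodyThreadIds = new HashSet<int>();

// Same shape as the query in the question: PLINQ may enumerate the source on
// worker threads, but foreach pulls the items back to the calling thread.
foreach (int item in from row in queue.AsParallel() select row)
{
    // The loop body always executes on the thread that runs the foreach.
    bodyThreadIds.Add(Environment.CurrentManagedThreadId);
}

Console.WriteLine("Distinct threads that ran the loop body: " + bodyThreadIds.Count);
Console.WriteLine("Only the calling thread: " + bodyThreadIds.Contains(callerThreadId));
```

The set only ever contains the calling thread's ID, which matches what the Thread dropdown in Visual Studio shows.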
There are many ways to run the body of the loop in parallel, but the first one that comes to mind is using Parallel.ForEach:
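Parallel.ForEach(arrestQueue, arrest =>
{
    Geocoder geocodeThis = new Geocoder(arrest);
    writeQueue.Enqueue(geocodeThis.Geocode());
    // ++k is not atomic across threads; Interlocked.Increment (System.Threading)
    // keeps the shared counter correct, assuming k is an int or long.
    Console.Out.WriteLine("Enqueued " + Interlocked.Increment(ref k));
});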
A foreach over a parallel query is still a single-threaded operation. AsParallel() returns a ParallelQuery, which supports a ForAll method that may (but is not guaranteed to) run its body in parallel. The code for that would be:
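arrestQueue.AsParallel().ForAll(arrest =>
{
    Geocoder geocodeThis = new Geocoder(arrest);
    writeQueue.Enqueue(geocodeThis.Geocode());
    // ++k is not atomic across threads; Interlocked.Increment (System.Threading)
    // keeps the shared counter correct, assuming k is an int or long.
    Console.Out.WriteLine("Enqueued " + Interlocked.Increment(ref k));
});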
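If you would rather keep the foreach, another option is to move the expensive call into the query itself, so that PLINQ has real work to spread across threads. This is only a sketch: the ints stand in for ArrestRecord, and the local Geocode function is a hypothetical placeholder for new Geocoder(arrest).Geocode().

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical stand-ins for the queues in the question.
var arrestQueue = new ConcurrentQueue<int>(Enumerable.Range(1, 100));
var writeQueue = new ConcurrentQueue<string>();

// Placeholder for the real geocoding call (an assumption, not the asker's code).
static string Geocode(int arrest) => "geocoded-" + arrest;

int k = 0;

// The expensive call sits inside Select, so PLINQ may run it on several threads.
var results = arrestQueue
    .AsParallel()
    .Select(arrest => Geocode(arrest));

// The foreach body still runs serially on the calling thread,
// which is why a plain ++k is safe here.
foreach (string result in results)
{
    writeQueue.Enqueue(result);
    ++k;
}

Console.WriteLine("Enqueued " + k + " results");
```

The loop body is now cheap (an enqueue and a counter bump), while the slow part runs inside the query. Note that the results arrive in no particular order unless you add AsOrdered().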