
The missing "Comparison of Parallel Processing APIs": how do I choose a multi-threading library?

I'm using the phrases Parallel Processing and Multi-threading interchangeably because I feel there is no difference between them. If I'm wrong, please correct me.

I'm not a pro in parallel processing/multi-threading. I'm familiar with, and have used, .NET threads and POSIX Threads. Nothing more than that.

I was just browsing through the SO archives on multi-threading and was surprised to see that there are so many libraries for multi-threading.

http://en.wikipedia.org/wiki/Template:Parallel_computing lists the APIs of well-known multi-threading libraries (I'm not sure if any others exist).

  1. POSIX Threads
  2. OpenMP
  3. PVM
  4. MPI
  5. UPC
  6. Intel Threading Building Blocks
  7. Boost.Thread
  8. Global Arrays
  9. Charm++
  10. Cilk
  11. Co-array Fortran
  12. CUDA

Also, I'm surprised to see http://en.wikipedia.org/wiki/Comparison_of_Parallel_Computing_Libraries_(API) is missing.

Until now, I've never been in a situation where I needed to choose between these libraries. But if I run into such a situation:

  1. How should I pick one?
  2. What are Pros & Cons of these libraries?
  3. Why do we have so many libraries for attacking a single problem?
  4. Which one is the best?


[1] The right choice of parallel library depends on the type of target parallel machine: (1) a shared memory machine (e.g., multicores) or (2) a distributed memory machine (e.g., Cell, Grid computing, CUDA). You also need to consider what kind of parallel programming model you want: (1) general-purpose multithreaded applications, (2) loop-level parallelism, (3) advanced parallelism such as pipelines, or (4) data-level parallelism.

First, the shared memory model is just multithreaded programming, because the address space is shared across all computation cores (e.g., chip multi-processors and symmetric multi-processors). There is no need to exchange data explicitly between threads and processes. OpenMP, Cilk, and TBB are all for this domain.
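To make the "no explicit data exchange" point concrete, here is a minimal sketch using standard C++ threads rather than any of the libraries named above. The function name `parallel_sum` and the chunking scheme are made up for illustration. Every worker reads the same vector in place, and each one writes only to its own slot, so no locking is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using `num_threads` worker threads.
// All workers read the same vector directly; nothing is copied or sent.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<long long> partial(num_threads, 0);  // one private slot per worker
    std::vector<std::thread> workers;
    // Ceiling division so that the chunks cover the whole vector.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&data, &partial, t, chunk] {
            // Clamp the range so trailing workers may get an empty slice.
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();  // wait for every worker to finish
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

On some toolchains you may need to link with `-pthread`. On a distributed memory machine the same job would need the slices to be sent to each node and the partial sums sent back.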

The distributed memory model used to be the main parallel programming model for supercomputers, where separate machines (i.e., the address space is not shared) are connected via a tightly coupled network. MPI is the most famous programming model for it. This model is still in use, especially for CUDA and Cell-based programming, where the memory address space is not shared. For example, CUDA separates CPU memory from GPU memory, so you need to send data explicitly between the two.

Next, you need to consider the parallel programming model. POSIX threads are for general-purpose multithreaded programming (e.g., highly multithreaded web servers). OpenMP, however, is more specialized for loop-level parallelism than the general POSIX/Win32 thread APIs. It simplifies thread fork and join. Intel TBB supports various forms of task-level parallelism, including loops and pipelines. There is another kind of parallelism that you could exploit: data-level parallelism. For this, a GPGPU would be better than a CPU, as GPGPUs are specialized for data-parallel workloads. There is also a programming model called stream processing.
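As a rough illustration of the loop-level case, here is a minimal OpenMP sketch. The function name `dot` is made up for this example; the `parallel for` directive and `reduction` clause are standard OpenMP. The single pragma handles the thread fork, the split of iterations, and the join, which is the simplification mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dot product with loop-level parallelism.
// Build with OpenMP enabled (e.g. -fopenmp on GCC/Clang) to run iterations
// in parallel; without it the pragma is ignored and the loop runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    // Each thread accumulates a private copy of `sum`; OpenMP combines
    // the copies when the loop ends.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Doing the same with raw POSIX threads would mean creating the threads, splitting the index range, and joining by hand, which is why OpenMP tends to be the easier fit when the work is a regular loop.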

[2] I already answered this above.

[3] Simple. There are many different parallel/concurrent programming models and many different parallel machines. So it isn't a single problem; there are so many sub-problems in parallel/concurrent programming that no single programming model can solve them all as of now.

[4] It depends. Seriously.


  • MPI is a message passing interface, not a multithreading interface

  • PVM is superseded by MPI for most purposes

  • Cilk is dead for most purposes

  • UPC, Co-array Fortran, and Global Arrays are not multithreading libraries; they are for working with distributed memory

  • CUDA is for devices very different from regular processors

  • OpenMP can be limiting if you are working outside of computational algebra/applications

  • POSIX threads are the de facto standard on UNIX; I am not sure about Windows

  • Boost.Thread is a universal object-oriented wrapper around the underlying libraries

  • Charm++ is not widely used outside of computational chemistry/biology

  • Intel Threading Building Blocks has lots of features, but it is a difficult library to use

I think Boost.Thread is a good middle ground. The ultimate choice depends on what you are trying to do.


Multi-threading is a way to achieve 'parallel processing' on a suitable computer (multiple processors/cores) with a modern operating system. But 'parallel processing' can also be accomplished on a 'share-nothing' cluster of machines that communicate via a network of some sort (Ethernet for pennies, and Myrinet or Dolphin for rich research groups/companies) - where each computer may have a single CPU and run a single user thread most of the time.


You can exploit the inherent parallelism of your process by organizing work into threads. However, whether the work truly runs in parallel depends upon the underlying hardware (single versus multiple CPUs, engines or cores).

There are also threading models in which each thread is bound to a system task or thread, and others where the threads share the underlying tasks. On a multi-processor the underlying tasks can run in parallel.

If you abstract the code correctly and it can take advantage of the threads, then it will run fine on a single-processor or multi-processor system, exploiting the multiple engines when/if available.


The right choice depends on the architecture you're targeting.

MPI and PVM are message passing interfaces that are generally used to coordinate work between nodes in cluster-type systems (that is, multiple independent compute nodes with network interconnects). Threads in their many forms are used to spread work across multiple cores or processors within a single system image (i.e. with shared memory).

Also note that the choices aren't necessarily mutually exclusive. In grad school I wrote a program that used MPI to communicate between nodes of a supercomputer, and used pthreads to take advantage of SMP at each node. This is not an uncommon approach in scientific computing.


.NET 4.0 will have some built-in parallel support: http://code.msdn.microsoft.com/ParExtSamples
