We investigate and compare three parallel programming models for implementing parallel programs in C++ on multi-core computer systems. The models under consideration are the tried and tested OpenMP, Intel® Threading Building Blocks (TBB) and a do-it-yourself approach using Pthreads and Boost.Thread. For demonstration purposes, we create multiple parallel implementations of an algorithm suitable for parallelisation using each of these models. The implementations are then compared in terms of their performance characteristics and the coding effort required. Additionally, the performance of the GNU C++ (G++) and Intel® C++ (ICC) compilers is compared. It is shown that OpenMP requires the least coding effort while still providing good performance. Pthreads, Boost.Thread and TBB all require significant changes to the structure of the program. The Pthreads and Boost.Thread implementations are more complex than the TBB implementation, but they provide more flexibility at the cost of increased risk. TBB, in turn, promotes a better programming style, abstracts away thread management and delivers respectable performance. Performance measurements reveal that the SSE-optimised Pthreads implementation compiled with ICC performs best. Without SSE, however, our OpenMP program compiled with G++ outperforms the other implementations.
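To illustrate why OpenMP tends to require little coding effort, the following minimal sketch parallelises a simple loop with a single directive. The kernel (a sum of squares) and the function names `sum_squares_serial` and `sum_squares_openmp` are hypothetical and chosen only for illustration; they are not the algorithm or code evaluated here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernel for illustration only: sum of squares of a vector.
std::int64_t sum_squares_serial(const std::vector<int>& data) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += static_cast<std::int64_t>(data[i]) * data[i];
    }
    return total;
}

// Same loop with one OpenMP directive added. The reduction clause gives
// each thread a private partial sum and combines them at the end.
// If the compiler is not invoked with OpenMP enabled (e.g. -fopenmp for G++),
// the pragma is ignored and the loop simply runs serially.
std::int64_t sum_squares_openmp(const std::vector<int>& data) {
    std::int64_t total = 0;
    const std::int64_t n = static_cast<std::int64_t>(data.size());
    #pragma omp parallel for reduction(+ : total)
    for (std::int64_t i = 0; i < n; ++i) {
        total += static_cast<std::int64_t>(data[i]) * data[i];
    }
    return total;
}
```

The loop body is unchanged between the two versions, which is the kind of low-effort change the comparison refers to. An equivalent Pthreads or Boost.Thread version would typically need explicit thread creation, work partitioning and joining.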