Investigating the Performance and Code Characteristics of Three Parallel Programming Models for C++

We investigate and compare three parallel programming models for implementing parallel programs in C++ on multi-core computer systems. The models under consideration are the tried-and-tested OpenMP, Intel®'s Threading Building Blocks (TBB) and a do-it-yourself approach using Pthreads and Boost.Thread. For demonstration purposes, we use these models to create multiple parallel implementations of an algorithm that is suitable for parallelisation. The implementations are then compared on their performance characteristics and on the coding effort required. Additionally, the performance of the GNU C++ (G++) and Intel® C++ (ICC) compilers is compared. It is shown that OpenMP requires the least coding effort, while still providing good performance. Pthreads, Boost.Thread and TBB all require significant changes to the structure of the program. The Pthreads and Boost.Thread implementations are more complex than the TBB implementation, but provide more flexibility, coupled with increased risk. TBB, in contrast, promotes a better programming style, abstracts away thread management and has respectable performance. Performance measurements reveal that the SSE-optimised Pthreads implementation compiled using ICC performs best. Without SSE, however, our OpenMP program compiled using G++ outperforms the other implementations.
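To give a rough sense of why OpenMP can require so little coding effort, the following minimal sketch parallelises a loop with a single directive. The workload (a sum of squares) and the function names `sum_of_squares_serial` and `sum_of_squares_openmp` are hypothetical and chosen only for illustration; they are not the algorithm or the code evaluated in this work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical workload for illustration only (not the algorithm studied here).
// Serial baseline: sum of the squares of all elements.
double sum_of_squares_serial(const std::vector<double>& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i] * v[i];
    }
    return total;
}

// OpenMP version: the loop body is unchanged. The directive asks the
// compiler to split the iterations across threads and to combine the
// per-thread partial sums via the reduction clause.
double sum_of_squares_openmp(const std::vector<double>& v) {
    double total = 0.0;
    // A signed loop index keeps the loop valid for older OpenMP versions.
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += v[i] * v[i];
    }
    return total;
}
```

With G++ the directive takes effect when the program is compiled with `-fopenmp`; without that flag the pragma is ignored and the function simply runs serially. A Pthreads or Boost.Thread version of the same loop would typically need explicit thread creation, work partitioning and joining, which is the kind of structural change referred to above.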