Scalability Validation of Parallel Sorting Algorithms

As single-core processor performance is no longer improving significantly, the computer industry is moving towards increasing the number of cores per processor or, in the case of large-scale computers, installing more processors per computer. Applications now need to scale with this increase in parallel computing power, and software developers need to take advantage of it. Parallel sorting algorithms are basic building blocks for many complex applications. In this thesis, we validate the expected execution time complexities of five state-of-the-art parallel sorting algorithms, implemented in C using MPI for parallelization, using a scalability validation framework based on Score-P and Extra-P. For each of the parallel sorting algorithms, we create a performance model. These models allow us to compare the algorithms' scalability behaviour with the expectations. Furthermore, we attempt to parallelize the local sorting step of the splitter-based parallel sorting algorithms via C++11 threads, OpenMP tasks, and CUDA acceleration. We construct the performance models on which we base our evaluations using uniformly randomly generated data. For most of the parallel sorting algorithms, we show that the given expectations match the created models. We discuss the discrepancies for the remaining algorithms in detail.
