Fast parallel in-memory 64-bit sorting

Parallel in-memory 64-bit sorting is an important problem in database management systems and in other applications such as Internet search engines and data mining tools. We propose a new algorithm called Parallel Counting Split Radix sort (PCS-Radix sort). The parallel stages of the algorithm increase data locality, correct the load imbalance among processors caused by data skew, and significantly reduce the amount of data communicated. The local stages of PCS-Radix sort operate only on the key bits that were not sorted during the parallel stages. Together, these improvements save a significant amount of computation and communication. In addition, PCS-Radix sort adapts to any parallel computer by changing three simple algorithmic parameters. We have implemented the algorithm on a Cray T3E-900, and the results show that it is more than 2 times faster than the previous fastest 64-bit parallel sorting algorithm. On 32 processors, PCS-Radix sort achieves a speedup of more than 23 relative to the fastest sequential algorithm available to us.
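The idea that the local stages only need to process the key bits left unsorted by an earlier split can be illustrated with a small sequential sketch. This is not the authors' PCS-Radix sort: it runs in a single process, models no communication or load balancing, and the function names and the parameters `top_bits` and `digit_bits` are hypothetical choices made for illustration (the abstract does not name the algorithm's three parameters). It assumes keys are unsigned integers below 2^64.

```python
from itertools import accumulate


def counting_split(keys, top_bits, key_bits=64):
    """Partition keys into 2**top_bits buckets by their most significant bits."""
    shift = key_bits - top_bits
    counts = [0] * (1 << top_bits)
    for k in keys:
        counts[k >> shift] += 1
    # Exclusive prefix sum: start offset of each bucket in the output.
    starts = [0] + list(accumulate(counts))[:-1]
    out = [0] * len(keys)
    pos = starts[:]
    for k in keys:
        b = k >> shift
        out[pos[b]] = k
        pos[b] += 1
    return out, starts, counts


def lsd_radix_sort(keys, low_bits, digit_bits=8):
    """Stable LSD radix sort that visits only the low_bits least significant bits."""
    mask = (1 << digit_bits) - 1
    for shift in range(0, low_bits, digit_bits):
        counts = [0] * (1 << digit_bits)
        for k in keys:
            counts[(k >> shift) & mask] += 1
        pos = [0] + list(accumulate(counts))[:-1]
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[pos[d]] = k
            pos[d] += 1
        keys = out
    return keys


def split_radix_sort(keys, top_bits=8, digit_bits=8, key_bits=64):
    """Split on the top bits, then sort each bucket on the remaining bits only."""
    out, starts, counts = counting_split(keys, top_bits, key_bits)
    low_bits = key_bits - top_bits  # bits not yet sorted by the split
    result = []
    for s, c in zip(starts, counts):
        if c:  # skip empty buckets
            result.extend(lsd_radix_sort(out[s:s + c], low_bits, digit_bits))
    return result
```

With the default values, the split handles the top 8 bits, so each bucket needs 7 radix passes over 8-bit digits instead of 8. In a parallel setting each bucket (or group of buckets) would be assigned to a processor, which is where the locality and communication savings described above would come from.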
