A Practical Quicksort Algorithm for Graphics Processors

In this paper we present GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multi-core graphics processors. Quicksort has previously been considered as an inefficient sorting solution for graphics processors, but we show that GPU-Quicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.

[1]  Hubert Nguyen,et al.  GPU Gems 3 , 2007 .

[2]  Robert Sedgewick,et al.  Implementing Quicksort programs , 1978, CACM.

[3]  Dinesh Manocha,et al.  Fast and approximate stream mining of quantiles and frequencies using graphics processors , 2005, SIGMOD '05.

[4]  C. A. R. Hoare,et al.  Algorithm 64: Quicksort , 1961, Commun. ACM.

[5]  Yi Zhang,et al.  A simple, fast parallel implementation of Quicksort and its performance evaluation on SUN Enterprise 10000 , 2003, Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2003. Proceedings..

[6]  W. Dally,et al.  Efficient conditional operations for data-parallel architectures , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.

[7]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[8]  Guy E. Blelloch,et al.  Prefix sums and their applications , 1990 .

[9]  Rüdiger Westermann,et al.  UberFlow: a GPU-based particle engine , 2004, SIGGRAPH '04.

[10]  J. T. Robinson,et al.  Parallel Quicksort Using Fetch-and-Add , 1990, IEEE Trans. Computers.

[11]  Alexandru Nicolau,et al.  Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared-Memory Machines , 1989, SIAM J. Comput..

[12]  Richard C. Singleton Algorithm 347: an efficient algorithm for sorting with minimal storage [M1] , 1969, CACM.

[13]  Pat Hanrahan,et al.  Photon mapping on programmable graphics hardware , 2003, HWWS '03.

[14]  David A. Bader,et al.  A Randomized Parallel Sorting Algorithm with an Experimental Study , 1998, J. Parallel Distributed Comput..

[15]  David R. Musser,et al.  Introspective Sorting and Selection Algorithms , 1997, Softw. Pract. Exp..

[16]  David J. Evans,et al.  The parallel quicksort algorithm part i–run time analysis , 1982 .

[17]  Yao Zhang,et al.  Scan primitives for GPU computing , 2007, GH '07.

[18]  Takuji Nishimura,et al.  Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator , 1998, TOMC.

[19]  Udi Manber,et al.  Introduction to algorithms - a creative approach , 1989 .

[20]  Ulf Assarsson,et al.  Fast parallel GPU-sorting using a hybrid algorithm , 2008, J. Parallel Distributed Comput..

[21]  Dinesh Manocha,et al.  GPUTeraSort: high performance graphics co-processor sorting for large database management , 2006, SIGMOD Conference.

[22]  Gabriel Zachmann,et al.  GPU-ABiSort: optimal parallel sorting on stream architectures , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[23]  Michael E. Saks,et al.  The periodic balanced sorting network , 1989, JACM.