Paper Information - Fast Quicksort Implementation Using AVX Instructions

Fast Quicksort Implementation Using AVX Instructions

This article describes a technique for implementing the quicksort sorting algorithm. Our method ‘vectorizes’ the computations and leverages the Advanced Vector Extensions (AVX) instructions available on Intel Core processors, as well as the AVX2 instructions introduced with Intel’s recent architecture, codenamed Haswell. Our solution offers several advantages over other high-performance sorting implementations, such as radix sort, as implemented in the Intel IPP library, or introsort, as implemented in the C++ STL. In addition to sorting numeric arrays, our method can also be used to sort complex structures with numeric keys, and even pointers to such structures.
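The abstract does not give the algorithm itself, so the following is only an illustrative model of what "vectorizing" a quicksort partition can look like, not the authors' implementation. It assumes eight 32-bit keys per 256-bit register and emulates the lane-wise comparison against the pivot in plain Python; the names `LANES`, `lane_mask`, `partition_blocks`, and `blockwise_quicksort` are hypothetical. A real AVX/AVX2 version would perform each comparison and the subsequent compaction with vector instructions rather than loops.

```python
import operator

LANES = 8  # assumption: eight 32-bit keys fit in one 256-bit register


def lane_mask(block, pivot, compare):
    """Model a lane-wise compare: bit i is set when compare(block[i], pivot) holds."""
    mask = 0
    for lane, key in enumerate(block):
        if compare(key, pivot):
            mask |= 1 << lane
    return mask


def partition_blocks(keys, pivot):
    """Split keys into (less, equal, greater) lists, one LANES-wide block at a time."""
    less, equal, greater = [], [], []
    for start in range(0, len(keys), LANES):
        block = keys[start:start + LANES]
        # Two masks per block stand in for two vector comparisons.
        gt = lane_mask(block, pivot, operator.gt)
        lt = lane_mask(block, pivot, operator.lt)
        # Compaction: each lane is routed by its mask bits only.
        for lane, key in enumerate(block):
            if (gt >> lane) & 1:
                greater.append(key)
            elif (lt >> lane) & 1:
                less.append(key)
            else:
                equal.append(key)
    return less, equal, greater


def blockwise_quicksort(keys):
    """Return a sorted copy of keys using the block-wise partition above."""
    keys = list(keys)
    if len(keys) <= 1:
        return keys
    pivot = keys[len(keys) // 2]
    less, equal, greater = partition_blocks(keys, pivot)
    # `equal` always contains the pivot, so both recursive calls shrink.
    return blockwise_quicksort(less) + equal + blockwise_quicksort(greater)
```

Keeping the keys equal to the pivot in their own list is a design choice of this sketch: it guarantees termination when the input contains many duplicates, at the cost of a second comparison per block.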

Shay Gueron | Vlad Krasnov
