An Efficient O(N) Comparison-Free Sorting Algorithm

In this paper, we propose a novel sorting algorithm that sorts integer input elements on-the-fly without any comparison operations between the data, that is, comparison-free sorting. We present a complete hardware structure, associated timing diagrams, and a formal mathematical proof, which show an overall sorting time, in terms of clock cycles, that is linearly proportional to the number of inputs, giving a time complexity on the order of O(N). Our hardware-based sorting algorithm avoids the need for SRAM-based memory or complex circuitry, such as pipelining structures; instead, it uses simple registers to hold the binary elements and each element's number of occurrences in the input set, and uses matrix-mapping operations to perform the sorting process. Thus, the total transistor count complexity is on the order of O(N). We evaluate an application-specific integrated circuit design of our sorting algorithm for a sample sorting of N = 1024 elements of size K = 10 bits using 90-nm Taiwan Semiconductor Manufacturing Company (TSMC) technology with a 1 V power supply. Results verify that our sorting requires approximately 4–6 μs to sort the 1024 elements at a clock frequency of 0.5 GHz, consumes 1.6 mW of power, and has a total transistor count of less than 750 000.
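
The idea of sorting without element-to-element comparisons can be illustrated in software. The sketch below is not the hardware design described in the abstract; it is a minimal Python analogue that assumes non-negative K-bit integers, uses each value as an index into a table of occurrence counters (standing in for the registers that hold the elements and their occurrence counts), and then reads the table out in index order. The function name `comparison_free_sort` and its parameters are hypothetical. Note that this software version scans all 2^K table slots, so its running time is O(N + 2^K); for the reported case of N = 1024 and K = 10, 2^K equals N.

```python
from typing import List


def comparison_free_sort(values: List[int], k_bits: int) -> List[int]:
    """Sort non-negative k_bits-wide integers without comparing elements to each other.

    Hypothetical software analogue: each value indexes its own occurrence counter.
    """
    size = 1 << k_bits           # number of distinct k_bits-wide values
    counts = [0] * size          # one occurrence counter per possible value

    for v in values:             # single pass over the N inputs
        # Range check against the bit width only; no two data elements are compared.
        if not 0 <= v < size:
            raise ValueError(f"value {v} does not fit in {k_bits} bits")
        counts[v] += 1           # the value itself selects its counter

    out: List[int] = []
    for value, occurrences in enumerate(counts):
        # Emit each value as many times as it occurred, in index (ascending) order.
        out.extend([value] * occurrences)
    return out


if __name__ == "__main__":
    print(comparison_free_sort([5, 3, 5, 0, 1023], k_bits=10))
```

As a sanity check on the reported timing: at 0.5 GHz one clock cycle lasts 2 ns, so 4–6 μs corresponds to roughly 2000–3000 cycles, or about 2N to 3N cycles for N = 1024, which is consistent with a sorting time linear in N.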
