A High-Performance Bidirectional Architecture for the Quasi-Comparison-Free Sorting Algorithm

This paper proposes a high-performance bidirectional architecture for the quasi-comparison-free sorting algorithm. Our architecture improves on the conventional unidirectional architecture by reducing the total number of sorting cycles through bidirectional sorting and two auxiliary methods. Bidirectional sorting allows sorting tasks to be conducted concurrently in the high- and low-index parts of the architecture. The first auxiliary method, boundary finding, shortens the range for index searching by finding the boundaries of that range. The second auxiliary method, queue storing, stores each useful index in a queue in advance to reduce the number of miss cycles during index searching. The performance of our architecture depends strongly on the distribution of the input data. For each set of input data to be sorted, five Gaussian distributions and four standard deviations for each distribution were adopted in our experiments. The results show that, at the expense of some additional area cost, our method significantly reduces both the number of sorting cycles and the energy consumption.
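The effect of bidirectional sorting and boundary finding on the number of index-search steps can be illustrated with a small software model. This is only a sketch under stated assumptions, not the proposed hardware: it assumes the sorter behaves like a counting-style sort over a fixed value range, the function names `unidirectional_sort` and `bidirectional_sort` and the `steps` counter are hypothetical, and one loop iteration stands in loosely for one index-search cycle. Queue storing is not modeled, because the abstract does not give enough detail to do so faithfully.

```python
def unidirectional_sort(data, value_range):
    """Baseline model: scan every index from 0 upward (assumed behavior)."""
    counts = [0] * value_range
    for v in data:
        counts[v] += 1              # record each value at its own index
    out, steps = [], 0
    for idx in range(value_range):
        steps += 1                  # one index-search step, hit or miss
        out.extend([idx] * counts[idx])
    return out, steps


def bidirectional_sort(data, value_range):
    """Model with boundary finding and a two-ended scan (assumed behavior)."""
    if not data:
        return [], 0
    counts = [0] * value_range
    for v in data:
        counts[v] += 1
    # Boundary finding: restrict the search to [lo, hi].
    # min()/max() are a software stand-in for the hardware mechanism.
    lo, hi = min(data), max(data)
    low_part, high_part = [], []
    steps = 0
    while lo <= hi:
        steps += 1                  # both ends advance in the same step
        low_part.extend([lo] * counts[lo])
        if hi != lo:                # avoid emitting the middle index twice
            high_part.extend([hi] * counts[hi])
        lo += 1
        hi -= 1
    # high_part was collected in descending order, so reverse it.
    return low_part + high_part[::-1], steps


if __name__ == "__main__":
    sample = [5, 3, 9, 3, 7]
    print(unidirectional_sort(sample, 16))
    print(bidirectional_sort(sample, 16))
```

In this model the baseline always scans the full value range, whereas the two-ended scan covers only about half of the span between the smallest and largest input values, which is one way to see why the benefit depends on how the input data are distributed.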
