3D-Sorter: 3D Design of a Resource-Aware Hardware Sorter for Edge Computing Platforms Under Area and Energy Consumption Constraints

In this paper, we propose a 3-dimensional hardware sorting architecture (3D-Sorter) based on the Multi-Dimensional Sorting Algorithm (MDSA). The proposed architecture transforms a sequence of input records into a 3-dimensional matrix. The records along every dimension are sorted in several MDSA phases, using partial sorting methods. Our synthesis results, obtained with Xilinx Vivado, indicate that the 3D-Sorter design decreases the number of Look-Up Tables (LUTs) and registers by 54% and 42.7%, respectively, compared to the state-of-the-art hardware sorter. In addition, power consumption is reduced by 48.15% on average. These results show that the proposed architecture offers substantial power and area savings for edge components.
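As a rough software illustration of the idea of arranging records in a 3-dimensional matrix and sorting along each dimension, the following Python sketch reshapes a flat list into a cube and sorts every axis-aligned line once per axis. It is not the MDSA phase schedule or the hardware design from the paper: the function names, the row-major layout, and the single pass per axis are all assumptions made for illustration. One pass per axis only yields a partial order (every line along every axis is ascending), which is why a complete sorter needs further phases.

```python
from itertools import product


def to_cube(records, dims):
    """Reshape a flat list into a nested d0 x d1 x d2 list (row-major; an assumed layout)."""
    d0, d1, d2 = dims
    if len(records) != d0 * d1 * d2:
        raise ValueError("record count must equal d0 * d1 * d2")
    return [[[records[(i * d1 + j) * d2 + k] for k in range(d2)]
             for j in range(d1)]
            for i in range(d0)]


def lines_along(dims, axis):
    """Yield the (i, j, k) coordinates of each line parallel to `axis`."""
    other = [range(n) for a, n in enumerate(dims) if a != axis]
    for fixed in product(*other):
        line = []
        for pos in range(dims[axis]):
            coord = list(fixed)
            coord.insert(axis, pos)  # put the moving index back in its slot
            line.append(tuple(coord))
        yield line


def sort_axis(cube, dims, axis):
    """Sort every line parallel to `axis` in ascending order, in place."""
    for line in lines_along(dims, axis):
        values = sorted(cube[i][j][k] for i, j, k in line)
        for (i, j, k), value in zip(line, values):
            cube[i][j][k] = value


def axis_is_sorted(cube, dims, axis):
    """Return True if every line parallel to `axis` is ascending."""
    for line in lines_along(dims, axis):
        values = [cube[i][j][k] for i, j, k in line]
        if any(a > b for a, b in zip(values, values[1:])):
            return False
    return True


def partial_sort_3d(records, dims):
    """One sorting pass per axis (hypothetical stand-in for the MDSA phases)."""
    cube = to_cube(records, dims)
    for axis in range(3):
        sort_axis(cube, dims, axis)
    return cube


if __name__ == "__main__":
    data = [17, 3, 22, 8, 1, 19, 14, 6, 11, 24, 2, 9,
            5, 21, 13, 7, 18, 4, 23, 10, 16, 12, 20, 15]
    dims = (2, 3, 4)
    cube = partial_sort_3d(data, dims)
    print(all(axis_is_sorted(cube, dims, a) for a in range(3)))
```

Each per-axis pass works on short, independent lines, which is the property that makes this kind of decomposition attractive for small parallel sorting units. The sketch says nothing about LUT, register, or power figures.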
