A High-Performance and Cost-Effective Hardware Merge Sorter without Feedback Datapath

We propose a high-performance and cost-effective hardware merge sorter (HMS) without any feedback datapath, with the aim of building the fastest hardware sorting accelerator. The operating frequency of existing HMSs is severely limited by the presence of feedback datapaths. We present an approach to eliminating these feedback datapaths, and propose a concrete architecture that adopts it together with several implementation optimizations. The evaluation results show that our HMS achieves a 1.59x throughput improvement with fewer hardware resources than the state-of-the-art HMS.
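As a rough software illustration of the kind of dependency a feedback datapath introduces, the sketch below models a basic 2-way merger one output record at a time. It is not the proposed architecture, and it is not taken from any existing HMS design; the function name `merge_two_way` and the FIFO names are hypothetical, chosen only for this example. The point it tries to show is that the comparison result of one step decides which input is dequeued, so the next comparison cannot begin until the current one has finished.

```python
from collections import deque

def merge_two_way(a, b):
    """Step-by-step model of a basic 2-way merger (illustrative assumption only)."""
    fifo_a, fifo_b = deque(a), deque(b)  # two already-sorted input streams
    out = []
    while fifo_a or fifo_b:
        # Compare the two FIFO heads; an empty FIFO always loses.
        take_a = bool(fifo_a) and (not fifo_b or fifo_a[0] <= fifo_b[0])
        # The comparison result selects which FIFO is dequeued, so the
        # next step's comparison depends on the outcome of this one.
        out.append(fifo_a.popleft() if take_a else fifo_b.popleft())
    return out

print(merge_two_way([1, 4, 7], [2, 3, 9]))  # [1, 2, 3, 4, 7, 9]
```

In hardware, a loop of this shape has to complete within a single clock cycle, which is one plausible reading of why such paths limit the operating frequency.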
