Automatic design of arbitrary-size approximate sorting networks with error guarantee

Although hardware sorters offer high performance, they become expensive as the number of inputs increases. To address the need for high-performance, power-efficient computing, we propose a scalable method for constructing power-efficient sorting networks suitable for hardware implementation. The proposed approach exploits the error resilience present in many real-world applications, such as digital signal processing, biological computing, and large-scale scientific computing. The method recursively constructs large sorting networks from smaller instances of approximate sorting networks. The design process is tunable and makes it possible to achieve desired tradeoffs between accuracy and power consumption or implementation cost. A search-based design method is used to obtain the approximate sorting networks. To measure and analyze the accuracy of approximate networks, three data-independent quality metrics are proposed: guaranteed error probability, worst-case error, and error distribution. A significant improvement in implementation cost and power consumption was obtained. For example, a 20% reduction in power consumption was achieved by introducing a small error into a 256-input sorter: the difference in rank is proved to be no worse than 2 with probability at least 99%, and the worst-case difference is guaranteed to be equal to 6.
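To make the notions of worst-case error and error probability concrete, the following minimal sketch evaluates a small comparator network by brute force. It is an illustration only, not the method of the paper: the function names, the 4-input networks, and the definition of rank error used here (the largest distance of any element from its correctly sorted position, measured over all input permutations) are assumptions introduced for this example, and exhaustive enumeration is feasible only for very small networks.

```python
from fractions import Fraction
from itertools import permutations


def apply_network(network, values):
    """Apply a comparator network, given as (i, j) index pairs with i < j."""
    out = list(values)
    for i, j in network:
        # A comparator places the smaller value on wire i, the larger on wire j.
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def rank_error_stats(network, n):
    """Return (worst-case rank error, probability of a nonzero error).

    Assumption for this sketch: inputs are permutations of 0..n-1, so the
    value v has rank v, and its rank error is |output position - v|.
    """
    worst = 0
    wrong = 0
    total = 0
    for perm in permutations(range(n)):
        out = apply_network(network, perm)
        err = max(abs(pos - v) for pos, v in enumerate(out))
        worst = max(worst, err)
        wrong += 1 if err > 0 else 0
        total += 1
    return worst, Fraction(wrong, total)


# A standard exact 5-comparator network for 4 inputs.
EXACT_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
# Hypothetical approximate variant: the final comparator is dropped.
APPROX_4 = EXACT_4[:-1]

exact_stats = rank_error_stats(EXACT_4, 4)
approx_stats = rank_error_stats(APPROX_4, 4)
```

In this toy case the exact network yields a worst-case error of 0, while dropping the last comparator saves one of five comparators at the price of a worst-case rank error of 1, which occurs for 2/3 of the input permutations. The same kind of tradeoff, at a much larger scale and with provable guarantees rather than enumeration, is what the abstract describes.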
