An ASIC design and formal analysis of a novel pipelined and parallel sorting accelerator

Abstract: Sorting, which is widely used in areas such as database systems, IP routing, bioinformatics, and cognitive-processing-based applications, imposes considerable overhead on computing resources. An efficient on-chip sorting accelerator may therefore significantly enhance real-time decision-making in such applications. In this paper, we introduce a novel pipelined and parallel sorting algorithm with streaming I/O whose time, logic, and memory complexities are all O(n). We present a formal analysis to prove the correctness of this algorithm. We then model, verify, and synthesize this unconditional algorithm (in TSMC 0.13 micron technology) for 4k-word clusters as an ASIC accelerating engine. More specifically, our implementation, with a 3969-word multiple-bank memory, 63 word-size comparators, 64 word-size multiplexers, and 63 word-size registers, requires only about 8k clock cycles to sort an arbitrary 3969-word array of random data, whose items arrive at and depart from the sorter one at a time.
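The figures quoted in the abstract can be sanity-checked with a little arithmetic. The sketch below is only an illustration and does not describe the authors' architecture. It reads "8k clock cycles" as roughly 8000 cycles, which is an assumption, and `cycles_per_word` is a hypothetical helper name. It shows that 3969 is exactly 63 squared, matching the comparator and register counts, and that the quoted cycle count works out to about two cycles per word, which is consistent with the claimed linear time complexity.

```python
import math

# Figures quoted in the abstract.
N_WORDS = 3969        # memory size and array length, in words
N_COMPARATORS = 63    # word-size comparators
N_REGISTERS = 63      # word-size registers

# Assumption: "8k clock cycles" is read here as roughly 8000 cycles.
QUOTED_CYCLES = 8000


def cycles_per_word(total_cycles: int, n_words: int) -> float:
    """Average number of clock cycles spent per sorted word (hypothetical helper)."""
    return total_cycles / n_words


# Integer square root of the array length, and whether the length is a perfect square.
side = math.isqrt(N_WORDS)
is_square = side * side == N_WORDS

ratio = cycles_per_word(QUOTED_CYCLES, N_WORDS)

print(f"sqrt(n) = {side}, perfect square: {is_square}")
print(f"comparators match sqrt(n): {side == N_COMPARATORS}")
print(f"cycles per word: {ratio:.2f}")
```

How the 63 comparators map onto the multiple-bank memory is not stated in the abstract, so any reading of the 63 x 63 relationship (for example, 63 banks of 63 words each) should be treated as a guess.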
