Hardware acceleration of divide-and-conquer paradigms: a case study

The authors describe a method for speeding up divide-and-conquer algorithms with a hardware coprocessor, using sorting as an example. A conventional processor performs the 'divide' and 'merge' phases, while a purpose-built coprocessor handles the 'conquer' phase. The authors show how transformation techniques from the Ruby language can be used to develop a family of systolic sorters, and how one of the resulting designs is prototyped in eight FPGAs on the CHS2x4, a PC coprocessor board from Algotronix. The hardware unit is invoked from within a sorting program, with the PC host merging the sorted sequences produced by the hardware sorter. The performance of this implementation is compared against various sorting algorithms on a number of PC systems.
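The division of labour described above can be illustrated with a minimal software sketch. This is not the authors' implementation: the function names, the block size, and the use of Python's built-in sort as a stand-in for the FPGA sorter are all assumptions made for illustration. The host splits the input into fixed-size blocks ('divide'), passes each block to a sorter with the same interface a coprocessor call might have ('conquer'), and merges the sorted blocks itself ('merge').

```python
import heapq
from typing import Callable, List, Sequence


def hardware_sort_stub(block: Sequence[int]) -> List[int]:
    """Hypothetical stand-in for the coprocessor: returns the block sorted.

    In the system described, this step would run on the systolic sorter;
    here it is plain software so the sketch is runnable.
    """
    return sorted(block)


def divide(data: Sequence[int], block_size: int) -> List[List[int]]:
    """'Divide' phase on the host: cut the input into fixed-size blocks."""
    return [list(data[i:i + block_size]) for i in range(0, len(data), block_size)]


def accelerated_sort(
    data: Sequence[int],
    block_size: int = 8,  # assumed capacity of the sorter, chosen arbitrarily
    conquer: Callable[[Sequence[int]], List[int]] = hardware_sort_stub,
) -> List[int]:
    """Sort by dividing on the host, conquering per block, merging on the host."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    blocks = divide(data, block_size)
    sorted_blocks = [conquer(block) for block in blocks]  # 'conquer' phase
    return list(heapq.merge(*sorted_blocks))  # 'merge' phase: k-way merge


if __name__ == "__main__":
    print(accelerated_sort([5, 3, 9, 1, 7, 2, 8, 6, 4, 0], block_size=4))
```

Passing the sorter in as the `conquer` argument keeps the host code unchanged if the stub were replaced by a call into real hardware; how such a call would look on the actual board is not specified in the abstract.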
