High-level synthesis with SIMD units

This paper presents novel techniques to integrate Single Instruction Multiple Data (SIMD) functional units into a high-level synthesis (HLS) design methodology. SIMD functional units can be configured to operate in one or more SIMD modes, in which they process multiple sets of smaller-bitwidth operands in parallel. Conceptually, the use of SIMD functional units enables HLS to (i) exploit parallelism to a higher degree without using additional resources, (ii) improve resource utilization by enabling hardware re-use at a fine-grained level, and (iii) improve energy efficiency for a given area and/or performance constraint. We illustrate the issues involved in performing high-level synthesis with SIMD functional units, and discuss how the algorithms in a typical high-level synthesis flow can be enhanced to maximize performance and energy improvements. These techniques are not restricted to specific high-level synthesis tools or algorithms, and can be plugged into any generic high-level synthesis system.
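As an illustration of what a SIMD mode means at the bit level, the following is a minimal software sketch, not the paper's hardware design. It models a single adder that can treat its operands either as one wide word or as several independent narrower lanes, suppressing the carry at each lane boundary. The function names (`simd_add`, `pack`, `unpack`) and the 32-bit default width are assumptions chosen for the example.

```python
def pack(values, lane_bits):
    """Pack a list of small integers into one word, lane 0 in the low bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack(word, lane_bits, lanes):
    """Split a packed word back into its lanes, lane 0 first."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(lanes)]


def simd_add(a, b, total_bits=32, lanes=1):
    """Add two packed words lane by lane.

    With lanes=1 this behaves as an ordinary total_bits-wide adder.
    With lanes>1 each lane wraps around independently, which models
    cutting the carry chain at the lane boundaries.
    """
    if total_bits % lanes != 0:
        raise ValueError("lanes must divide total_bits evenly")
    lane_bits = total_bits // lanes
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(lanes):
        shift = i * lane_bits
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        # Masking the sum discards the carry out of this lane.
        result |= ((x + y) & mask) << shift
    return result


a = pack([250, 2, 3, 4], 8)
b = pack([10, 1, 1, 1], 8)

# Four 8-bit additions on one 32-bit unit: lane 0 overflows and wraps.
four_lane = unpack(simd_add(a, b, 32, 4), 8, 4)

# Same operands in plain 32-bit mode: the carry ripples into lane 1.
one_lane = unpack(simd_add(a, b, 32, 1), 8, 4)
```

In the four-lane mode, the overflow in lane 0 stays inside that lane. In the single-lane mode the same overflow propagates into the next byte. This is why a synthesis flow has to track which mode a unit is in whenever it binds operations to that unit.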
