CASCADE - configurable and scalable DSP environment

As the complexity of embedded systems grows rapidly, it has become common to accelerate critical tasks in hardware. Designers usually rely on off-the-shelf components or licensed IP cores to shorten time to market, but the hardware/software interfacing is tedious, error-prone, and usually not portable. Moreover, existing hardware seldom matches the requirements perfectly. As an alternative, we propose CASCADE, a design environment that generates coprocessing datapaths from algorithms specified in C/C++ and attaches these datapaths to the embedded processor through an auto-generated software driver. The number of datapaths and the number of parallel functional units inside each are scaled to fit the application. CASCADE integrates seamlessly with the design tools of the embedded processor, which reduces retraining and redesign effort and keeps product development time as short as that of pure software approaches. A JPEG encoder was successfully built in CASCADE with an auto-generated four-MAC accelerator, achieving a 623% performance boost for our video application.
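To make the idea of scaling parallel functional units concrete, the following is a minimal sketch in C, not taken from the paper. It shows a multiply-accumulate kernel of the kind found in a JPEG encoder's transform stage, written once in a plain form and once with four independent accumulators, which is the shape a four-MAC datapath could exploit. The names `dot_ref`, `dot_four_mac`, and `N` are hypothetical, and nothing here reflects the actual input format or output of CASCADE.

```c
#include <stdint.h>

#define N 8  /* assumed block length, e.g. one row of an 8x8 block */

/* Reference form: one multiply-accumulate (MAC) per iteration. */
int32_t dot_ref(const int16_t x[N], const int16_t c[N]) {
    int32_t acc = 0;
    for (int i = 0; i < N; i++) {
        acc += (int32_t)x[i] * c[i];
    }
    return acc;
}

/* Unrolled form: four independent accumulators, so the four products in
 * each iteration have no data dependence on one another and could be
 * issued to four parallel MAC units. N must be a multiple of 4. */
int32_t dot_four_mac(const int16_t x[N], const int16_t c[N]) {
    int32_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    for (int i = 0; i < N; i += 4) {
        acc0 += (int32_t)x[i]     * c[i];
        acc1 += (int32_t)x[i + 1] * c[i + 1];
        acc2 += (int32_t)x[i + 2] * c[i + 2];
        acc3 += (int32_t)x[i + 3] * c[i + 3];
    }
    /* Final reduction of the partial sums. */
    return (acc0 + acc1) + (acc2 + acc3);
}
```

Both functions compute the same result; the second only exposes the parallelism explicitly. In software the unrolled loop still runs sequentially, so any speedup would come from mapping the four accumulations onto dedicated hardware.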
