Flexible hardware acceleration for multimedia-oriented microprocessors

Executing multimedia applications on a microprocessor benefits greatly from hardware acceleration, in both speed and energy consumption. Although the basic functionality implemented in these accelerators remains constant across product versions, small changes are still often required. With the proposed architecture and protocol, the accelerator hardware retains the performance and cost benefits of a hardwired solution while offering all the flexibility needed in practice. From the user's point of view, the entire application remains programmable.
