ACT: A Low Power VLIW Cluster Coprocessor for DSP Applications

The ACT (Adaptive Cellular Telephony) coprocessor architecture is described and analyzed using a set of widely used DSP algorithms. Performance and power are compared to equivalent implementations on ASIC and embedded processor platforms. Flexibility is achieved by fine-grain program control of communication and execution resources. Compression techniques, simple addressing modes for large single-ported distributed register files, and configurable address generation units provide performance and energy efficiency. An energy-delay reduction of two to three orders of magnitude is achieved when compared to a conventional embedded processor such as the Intel XScale.
