DSS: Applying asynchronous techniques to architectures exploiting ILP at compile time

Embedded applications demand both high performance and low power. Architectures that exploit instruction-level parallelism (ILP) at compile time, such as very long instruction word (VLIW) and transport triggered architecture (TTA), can meet these requirements, and implementing them with asynchronous circuits can significantly reduce their power consumption. However, most current asynchronous processors are based on RISC-like architectures. In an asynchronous VLIW or TTA processor, the distribution of control introduces serious problems, and the variable latencies of operations can cause errors. This paper investigates asynchronous processors with architectures that exploit ILP at compile time. To overcome these problems, we propose a data source selecting (DSS) scheme that guarantees instructions run correctly on asynchronous VLIW and TTA processors. As a concrete case, we design an asynchronous pipelined processor based on TTA, present its micro-architecture, and implement it as a processor named Tengyue in 180 nm technology. Experimental results over a range of benchmarks and working modes show that the asynchronous TTA processor with DSS support runs correctly and that its power dissipation is about 43% to 65% of that of the equivalent synchronous processor.
