The QC-2 parallel Queue processor architecture

Processors built on a queue-based instruction set architecture offer an attractive option for embedded system design. In our previous work, we proposed a novel queue processor architecture as a starting point for hardware/software design space exploration for embedded applications. In this paper, we present a high-performance 32-bit synthesizable QueueCore (QC-2), an improved and optimized version of the produced order parallel Queue processor (PQP), with single-precision floating-point support. The QC-2 core also implements a novel technique for extending immediate values and memory instruction offsets that could not be represented in the PQP processor because of bit-width constraints. A prototype implementation is produced by synthesizing the high-level model for a target FPGA device. We present the architecture description and the design results in detail.
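For readers unfamiliar with queue computing, the sketch below illustrates the general idea behind a queue-based instruction set: operands are taken from the head of a FIFO operand queue and results are written to its tail, so instructions need no explicit register names. It is a minimal illustration of the generic queue computation model only, not the QC-2 instruction set; the opcode names (`ld`, `add`, `sub`, `mul`) and the function `run_queue_program` are hypothetical and chosen for this example.

```python
from collections import deque

# Hypothetical two-operand opcodes for illustration; not the QC-2 ISA.
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run_queue_program(program):
    """Evaluate a queue program on a FIFO operand queue.

    Each instruction is a tuple. ("ld", value) enqueues a value at the
    tail. A binary opcode dequeues two operands from the head and
    enqueues the result at the tail.
    """
    queue = deque()
    for instr in program:
        opcode = instr[0]
        if opcode == "ld":
            queue.append(instr[1])
        else:
            first = queue.popleft()   # operand at the head of the queue
            second = queue.popleft()  # next operand
            queue.append(BINARY_OPS[opcode](first, second))
    return queue.popleft()


# (1 + 2) * (7 - 4), with instructions ordered level by level from the
# leaves of the expression tree up to the root.
PROGRAM = [
    ("ld", 1), ("ld", 2), ("ld", 7), ("ld", 4),
    ("add",), ("sub",),
    ("mul",),
]

result = run_queue_program(PROGRAM)
print(result)  # 9
```

Note that `add` and `sub` read disjoint operands and depend on no earlier arithmetic result, so in this model instructions from the same level of the expression tree could be issued in parallel.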
