Design and implementation of a queue compiler

Queue processors are a viable alternative for high-performance embedded computing and parallel processing. We present the design and implementation of a compiler for a queue-based processor. Instructions of a queue processor reference their operands implicitly, which makes programs free of false dependencies. Compiling for a queue machine differs from traditional compilation for register machines: the queue compiler is responsible for scheduling the program in a level-order manner to expose natural parallelism, and for calculating each instruction's relative offset values so that it can access its operands. This paper describes the phases and data structures the queue compiler uses to compile C programs into assembly code for the QueueCore, an embedded queue processor. Experimental results show that our compiler produces good code in terms of parallelism and code size when compared with code produced by a traditional compiler for a RISC processor.
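To make the idea of level-order scheduling concrete, the following is a minimal sketch in Python, not the compiler described in this paper. It assumes a pure queue model in which every instruction dequeues its operands from the head of the queue and enqueues its result at the tail, and it handles expression trees only; offset calculation for shared operands is not covered. The names `Node`, `level_order_schedule`, and `run_queue_program`, and the mnemonics `add`, `sub`, `mul`, are hypothetical and chosen for illustration.

```python
from collections import deque
import operator


class Node:
    """Expression tree node: a leaf holds a variable name, an inner node an opcode."""

    def __init__(self, op, left=None, right=None):
        self.op = op
        self.left = left
        self.right = right


def level_order_schedule(root):
    """Emit nodes level by level, deepest level first, left to right in each level."""
    levels = []
    frontier = [root]
    while frontier:
        levels.append(frontier)
        next_frontier = []
        for node in frontier:
            for child in (node.left, node.right):
                if child is not None:
                    next_frontier.append(child)
        frontier = next_frontier
    return [node.op for level in reversed(levels) for node in level]


# Hypothetical binary opcodes for the illustration.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run_queue_program(program, env):
    """Interpret a program on a FIFO operand queue (assumed pure queue model)."""
    queue = deque()
    for instr in program:
        if instr in OPS:
            lhs = queue.popleft()  # operands come from the head of the queue
            rhs = queue.popleft()
            queue.append(OPS[instr](lhs, rhs))  # results go to the tail
        else:
            queue.append(env[instr])  # a leaf acts as a load of a variable
    return queue.popleft()


# (a + b) * (c - d)
tree = Node("mul",
            Node("add", Node("a"), Node("b")),
            Node("sub", Node("c"), Node("d")))
program = level_order_schedule(tree)
result = run_queue_program(program, {"a": 2, "b": 3, "c": 7, "d": 4})
```

In this sketch, instructions on the same level do not depend on one another, which is the sense in which a level-order schedule exposes parallelism: the four loads can proceed together, and so can `add` and `sub`. No instruction names a register, so no false dependencies arise from register reuse.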
