Integrating a New Cluster Assignment and Scheduling Algorithm into an Experimental Retargetable Code Generation Framework

This paper presents a new unified algorithm for cluster assignment and region scheduling, and describes its integration into an experimental retargetable code generation framework. The framework comprises an instruction selector generator based on a recent technique, the IMPACT front end, a machine description module that uses a version of the HMDES machine description language modified to include cluster information, a combined cluster allocator and acyclic region scheduler, and a register allocator. The framework has been targeted to the Texas Instruments TMS320C62x architecture, and we report preliminary results on a set of TI benchmarks.
