Code generation algorithms for digital signal processors

The dramatic reduction in the cost of electronic devices, combined with large improvements in design productivity from automatic tools, is gradually opening up the possibility of high-performance, very low-cost computation. This is reflected in the increasing demand for high-performance portable devices with low power consumption and low cost. The best way to design such systems is to use dedicated architectures. Digital Signal Processors (DSPs) are specialized architectures that provide these features for applications requiring intensive numeric computation, such as those running in telecommunication systems. This thesis deals with code generation for DSPs and addresses two main problems in this area: the generation of code for program basic blocks, and the generation of code for address computation. The work makes two contributions to the basic block code generation problem. First, it proposes a model of the processor architecture that captures the impact of the datapath design on the code generation algorithm. Second, based on this model, it proves the existence of an optimal O(n) algorithm for expression tree code generation for a class of DSP architectures. Experimental results are used to confirm the correctness of the algorithm. For the address generation problem, the thesis contributes an algorithm for allocating array references to address registers that maximizes the use of the registers and reduces the cost of addressing operations in the program. Experiments show that the proposed optimization can considerably improve the final addressing code compared with code from the best optimizing compiler available for the target DSP.
