Embedded software in real-time signal processing systems: design technologies

The increasing use of embedded software, often implemented on a core processor in a single-chip system, is a clear trend in the telecommunications, multimedia, and consumer electronics industries. A companion paper (Paulin et al., 1997) surveys application and architecture trends for embedded systems in these growth markets. However, the lack of suitable design technology remains a significant obstacle to the development of such systems. One of the key requirements is more efficient software compilation technology. Especially for fixed-point digital signal processor (DSP) cores, commercially available compilers are often reported to be unable to take full advantage of the architectural features of the processor. Moreover, because of the shorter lifetimes and the architectural specialization of many processor cores, processor designers are often compelled to neglect compiler support. This situation has led to increased research activity in design tool support for embedded processors. This paper discusses design technology issues for embedded systems using processor cores, with a focus on software compilation tools. Architectural characteristics of contemporary processor cores are reviewed and tool requirements are formulated. This is followed by a comprehensive survey of both existing and new software compilation techniques that are considered important in the context of embedded processors.

[1]  Christopher W. Fraser,et al.  Engineering a simple, efficient code-generator generator , 1992, LOPL.

[2]  John R. Ellis,et al.  Bulldog: A Compiler for VLIW Architectures , 1986 .

[3]  Rainer Leupers,et al.  Instruction selection for embedded DSPs with complex instructions , 1996, Proceedings EURO-DAC '96. European Design Automation Conference with EURO-VHDL '96 and Exhibition.

[4]  R. Preston Gurd Experience developing microcode using a high level language , 1983, SIGM.

[5]  Bruce D. Shriver,et al.  Local Microcode Compaction Techniques , 1980, CSUR.

[6]  Susan L. Graham,et al.  An experiment in table driven code generation , 1982, SIGPLAN '82.

[7]  Rainer Leupers,et al.  Algorithms for address assignment in DSP code generation , 1996, Proceedings of International Conference on Computer Aided Design.

[8]  Sharad Malik,et al.  Memory bank and register allocation in software synthesis for ASIPs , 1995, Proceedings of IEEE International Conference on Computer Aided Design (ICCAD).

[9]  Francis Depuydt,et al.  Register Optimization and Scheduling for Real-Time Digital Signal Processing Architectures , 1993 .

[10]  Alice C. Parker,et al.  The high-level synthesis of digital systems , 1990, Proc. IEEE.

[11]  Clifford Liem,et al.  Trends In Embedded Systems Technology , 1996 .

[12]  Susan L. Graham,et al.  A new method for compiler code generation , 1978, POPL '78.

[13]  J. Janardhan,et al.  Enhanced region scheduling on a program dependence graph , 1992, MICRO 25.

[14]  Alex Van Someren,et al.  The Arm Risc Chip: A Programmer's Guide , 1994 .

[15]  Hugo De Man,et al.  Modelling hardware-specific data-types for simulation and compilation in HW/SW co-design , 1996 .

[16]  E.A. Lee Programmable DSP architectures. II , 1989, IEEE ASSP Magazine.

[17]  B. Ramakrishna Rau,et al.  Efficient code generation for horizontal architectures: Compiler techniques and architectural support , 1982, ISCA '82.

[19]  D. H. Bartley,et al.  Optimizing stack frame accesses for processors with restricted addressing modes , 1992, Softw. Pract. Exp.

[20]  Jian Wang,et al.  A software pipelining based VLIW architecture and optimizing compiler , 1990, Proceedings of the 23rd Annual Workshop and Symposium on Microprogramming and Microarchitecture (MICRO 23).

[21]  Hugo De Man,et al.  A graph based processor model for retargetable code generation , 1996, Proceedings ED&TC European Design and Test Conference.

[22]  Markus Freericks,et al.  Describing instruction set processors using nML , 1995, Proceedings of the European Design and Test Conference (ED&TC 1995).

[23]  Mario Barbacci,et al.  Instruction set processor specifications (ISPS): The notation and its applications , 1981, IEEE Transactions on Computers.

[24]  Jeffrey D. Ullman,et al.  The Generation of Optimal Code for Arithmetic Expressions , 1970, JACM.

[25]  Charles N. Fischer,et al.  Affix grammar driven code generation , 1985, TOPL.

[26]  Gary William Grewal,et al.  An integrated approach to retargetable code generation , 1994, Proceedings of 7th International Symposium on High-Level Synthesis.

[27]  T. C. May,et al.  Instruction-set matching and selection for DSP and ASIP code generation , 1994, Proceedings of European Design and Test Conference EDAC-ETC-EUROASIC.

[28]  Peter M. Kogge,et al.  The Architecture of Pipelined Computers , 1981 .

[29]  Mike Johnson,et al.  Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.

[30]  B. Wess Automatic instruction code generation based on trellis diagrams , 1992, Proceedings of the 1992 IEEE International Symposium on Circuits and Systems.

[31]  Carlos Valderrama,et al.  Trends in embedded systems technology: an industrial perspective , 1995 .

[32]  Alfred V. Aho,et al.  Code generation using tree matching and dynamic programming , 1989, ACM Trans. Program. Lang. Syst.

[33]  Hugo De Man,et al.  Data routing: a paradigm for efficient data-path synthesis and code generation , 1994, Proceedings of 7th International Symposium on High-Level Synthesis.

[34]  Christopher W. Fraser,et al.  The Design and Application of a Retargetable Peephole Optimizer , 1980, TOPL.

[35]  Christian Piguet,et al.  Microprocessor design , 1997 .

[36]  Rainer Leupers,et al.  Time-constrained code compaction for DSPs , 1995 .

[37]  John Cocke,et al.  Register allocation via graph coloring , 1981 .

[38]  Stefan M. Freudenberger,et al.  Phase Ordering of Register Allocation and Instruction Scheduling , 1991, Code Generation.

[39]  Alexandru Nicolau,et al.  Percolation Scheduling: A Parallel Compilation Technique , 1985 .

[40]  Eduardo Pelegrí-Llopart,et al.  Optimal code generation for expression trees: an application of BURS theory , 1988, POPL '88.

[41]  Helmut Emmelmann,et al.  BEG: a generator for efficient back ends , 1989, PLDI '89.

[42]  Toshio Nakatani,et al.  A new compilation technique for parallelizing loops with unpredictable branches on a VLIW architecture , 1990 .

[43]  G. Goossens,et al.  Programmable Chips in Consumer Electronics and Telecommunications , 1996 .

[44]  Allen Newell,et al.  Computer Structures: Readings and Examples , 1971 .

[45]  Heinrich Meyr,et al.  Code Generation and Optimization Techniques for Embedded Digital Signal Processors , 1996 .

[46]  Emile H. L. Aarts,et al.  Architecture and programming of a VLIW style programmable video signal processor , 1991, MICRO 24.

[47]  Christopher W. Fraser,et al.  Code selection through object code optimization , 1984, TOPL.

[48]  Gert Goossens,et al.  Embedded software in real-time signal processing systems: application and architecture trends , 1997 .

[50]  Richard M. Stallman,et al.  Using and Porting GNU CC , 1998 .

[51]  Hugo Krawczyk,et al.  Code duplication: an assist for global instruction scheduling , 1991, MICRO 24.

[52]  Shlomit S. Pinter,et al.  Register allocation with instruction scheduling: a new approach , 1996, Journal of Programming Languages.

[53]  Peter Marwedel,et al.  Verification of Hardware Descriptions by Retargetable Code Generation , 1989, 26th ACM/IEEE Design Automation Conference.

[54]  Bernhard Wess Code generation based on trellis diagrams , 1994, Code Generation for Embedded Processors.

[55]  Sharad Malik,et al.  Optimal code generation for embedded memory non-homogeneous register architectures , 1995 .

[56]  Andreas Krall,et al.  Dependence-Conscious Global Register Allocation , 1994, Programming Languages and System Architectures.

[57]  D. Bursky Tuned RISC devices deliver top performance , 1997 .

[58]  Rainer Leupers,et al.  A BDD-based frontend for retargetable compilers , 1995, Proceedings of the European Design and Test Conference (ED&TC 1995).

[59]  Paul Hilfinger,et al.  A Compiler for Application-Specific Signal Processors , 1989 .

[60]  Qunyan Wu Register Allocation via Hierarchical Graph Coloring , 1996 .

[61]  Christopher W. Fraser,et al.  BURG: fast optimal instruction selection and tree parsing , 1992, SIGP.

[62]  Alfred V. Aho,et al.  Optimal Code Generation for Expression Trees , 1976, J. ACM.

[63]  Alfred V. Aho,et al.  Optimal code generation for expression trees , 1975, STOC.

[64]  Gert Goossens,et al.  Chess: retargetable code generation for embedded DSP processors , 1994, Code Generation for Embedded Processors.

[65]  Peter Marwedel Tree-based mapping of algorithms to predefined structures , 1993, ICCAD.

[66]  Alfred V. Aho,et al.  Code Generation for Expressions with Common Subexpressions , 1977, J. ACM.

[67]  Todd A. Proebsting Simple and efficient BURS table generation , 1992, PLDI '92.

[68]  Joseph A. Fisher,et al.  Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.

[69]  Bruce D. Shriver,et al.  Some Experiments in Local Microcode Compaction for Horizontal Machines , 1981, IEEE Transactions on Computers.

[70]  Kurt Keutzer,et al.  Storage assignment to decrease code size , 1996, TOPL.

[71]  Andreas Fauth Beyond tool-specific machine descriptions , 1994, Code Generation for Embedded Processors.

[72]  Pierre G. Paulin,et al.  Flexware: A flexible firmware development environment for embedded systems , 1994, Code Generation for Embedded Processors.

[73]  L. Bergher,et al.  MPEG audio decoder for consumer applications , 1995, Proceedings of the IEEE 1995 Custom Integrated Circuits Conference.

[74]  John L. Hennessy,et al.  The priority-based coloring approach to register allocation , 1990, TOPL.

[75]  Stan Y. Liao,et al.  Code generation and optimization for embedded digital signal processors , 1996 .

[76]  Jian Wang,et al.  GURPR—a method for global software pipelining , 1987, MICRO 20.

[77]  Joos Vandewalle,et al.  Loop Optimization in Register-Transfer Scheduling for DSP-Systems , 1989, 26th ACM/IEEE Design Automation Conference.

[78]  R. Hartmann,et al.  Combined scheduling and data routing for programmable ASIC systems , 1992, Proceedings of the European Conference on Design Automation.

[79]  Shlomit S. Pinter,et al.  Register allocation with instruction scheduling , 1993, PLDI '93.

[80]  J.T.J. van Eijndhoven,et al.  A data flow graph exchange standard , 1992, Proceedings of the European Conference on Design Automation.

[81]  Mark N. Wegman,et al.  Efficiently computing static single assignment form and the control dependence graph , 1991, TOPL.

[82]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[83]  David A. Patterson,et al.  Computer Architecture: A Quantitative Approach , 1969 .

[84]  Ahmed Amine Jerraya,et al.  Address calculation for retargetable compilation and exploration of instruction-set architectures , 1996, DAC '96.

[85]  John L. Bruno,et al.  Code Generation for a One-Register Machine , 1976, J. ACM.

[86]  J. Praet,et al.  Programmable Chips in Consumer Electronics and Telecommunications Architectures and Design Technology , 1996 .

[87]  Steven R. Vegdahl Phase coupling and constant generation in an optimizing microcode compiler , 1982, MICRO 15.

[88]  Rainer Leupers,et al.  Time-constrained code compaction for DSPs , 1997, IEEE Trans. Very Large Scale Integr. Syst.

[89]  Clifford Liem,et al.  Industrial experience using rule-driven retargetable code generation for multimedia applications , 1995 .

[90]  Gregory J. Chaitin,et al.  Register allocation and spilling via graph coloring , 2004, SIGP.

[91]  Lehrstuhl Informatik XII The MIMOLA Language Version 4.1 , 1994 .