Software thread integration for embedded system display applications

Embedded systems must control many concurrent real-time activities, which leads to designs built around a variety of hardware peripherals, each providing a specific, dedicated service. These peripherals increase system size, cost, weight, and design time. Software thread integration (STI) provides low-cost thread concurrency on general-purpose processors by automatically interleaving multiple threads of control into one. This simplifies hardware-to-software migration (which eliminates dedicated hardware) and can help embedded system designers meet design constraints such as size, weight, and cost. We have developed concepts for performing STI and have implemented many of them in Thrint, our automated post-pass compiler. Here we present the transformations and examine how well the compiler integrates threads for two display applications. We examine the integration procedure, the processor load, and the code memory expansion. Integration reclaims CPU idle time, yielding run-time speedups of 1.6x to 3.6x.
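To make the idle-time argument concrete, the following is a minimal back-of-the-envelope sketch, not the Thrint algorithm. It assumes a hypothetical guest thread (for example, a display refresh routine) that does a fixed amount of work in each fixed-length period and otherwise busy-waits, and a host thread with a fixed amount of work. All function names, parameters, and numbers are illustrative assumptions rather than measurements from the paper, and the model ignores integration overhead and code expansion.

```python
def discrete_cycles(guest_events, period, guest_work, host_work):
    """Cycles when the guest busy-waits through its idle time and the
    host thread runs only after the guest has finished.
    guest_work is unused here because busy-waiting fills the whole period."""
    return guest_events * period + host_work


def integrated_cycles(guest_events, period, guest_work, host_work):
    """Cycles when host work is placed into the guest's idle slots.
    Any host work that does not fit runs after the guest finishes."""
    idle_per_period = period - guest_work
    reclaimed = min(host_work, guest_events * idle_per_period)
    return guest_events * period + (host_work - reclaimed)


def speedup(guest_events, period, guest_work, host_work):
    """Ratio of discrete to integrated run time under this toy model."""
    before = discrete_cycles(guest_events, period, guest_work, host_work)
    after = integrated_cycles(guest_events, period, guest_work, host_work)
    return before / after


if __name__ == "__main__":
    # Hypothetical workload: 100 guest events, one every 10 cycles,
    # 3 cycles of real guest work per event, 700 cycles of host work.
    print(discrete_cycles(100, 10, 3, 700))
    print(integrated_cycles(100, 10, 3, 700))
    print(speedup(100, 10, 3, 700))
```

In this model the achievable speedup is bounded by how much of the guest's period is idle and by how much host work is available to fill it, which is one plausible reason the reported speedups vary across applications.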
