A Compile-Time Optimization Method for WCET Reduction in Real-Time Embedded Systems through Block Formation

Compile-time optimizations play an important role in the efficient design of real-time embedded systems. Most such optimizations are designed to reduce average-case execution time (ACET). While ACET is the main concern in high-performance computing systems, worst-case execution time (WCET) matters far more in real-time embedded systems, so WCET reduction is the more desirable goal in many of them. In this article, we propose a compile-time optimization method aimed at reducing WCET in real-time embedded systems. The method relies on the predicated execution capability of embedded processors: program code blocks that lie on the worst-case paths of the program are merged to increase instruction-level parallelism and thereby create more opportunities for WCET reduction. Predicated execution also makes it possible to merge code blocks from different worst-case paths, which can be very effective in reducing WCET. Experimental results show that the proposed method reduces WCET by up to 45% compared to previous compile-time block formation methods. Moreover, while the proposed method usually achieves a larger WCET reduction than previous work, it has a considerably smaller negative impact on ACET and code size.
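
To give some intuition for why merging blocks under predication can shorten the worst-case path, the following toy cycle model compares an if-then-else region executed with a branch against the same region merged into one predicated block. This is only an illustrative sketch and not the method proposed in the article: the function names, the fixed branch penalty, the single-cycle operations, and the assumption that each arm is a dependent chain while the two arms are independent of each other are all hypothetical simplifications.

```python
import math


def wcet_branching(cond_cycles, then_ops, else_ops, branch_penalty):
    """Worst-case cycles when the region keeps its branch (toy model).

    Assumption: each arm is a chain of dependent single-cycle operations,
    so an arm of n operations takes n cycles regardless of issue width.
    The worst case pays the branch penalty and then runs the longer arm.
    """
    return cond_cycles + branch_penalty + max(then_ops, else_ops)


def wcet_predicated(cond_cycles, then_ops, else_ops, issue_width):
    """Worst-case cycles after both arms are merged into one predicated block.

    Assumption: the two arms are independent of each other, so their
    operations can be interleaved. The merged block is bounded below by
    the longer dependent chain and by the available issue slots.
    Both arms are always executed, so no branch penalty is paid.
    """
    slot_bound = math.ceil((then_ops + else_ops) / issue_width)
    chain_bound = max(then_ops, else_ops)
    return cond_cycles + max(slot_bound, chain_bound)


if __name__ == "__main__":
    # Hypothetical numbers: 1-cycle compare, arms of 4 and 3 operations,
    # 2-cycle branch penalty, dual-issue processor.
    print(wcet_branching(1, 4, 3, branch_penalty=2))   # prints 7
    print(wcet_predicated(1, 4, 3, issue_width=2))     # prints 5
```

In this model the merged block wins on a dual-issue machine because the shorter arm fills otherwise idle issue slots and the branch penalty disappears. On a single-issue machine the same merge would cost 1 + 7 = 8 cycles, more than the branching version, which is why any such merging has to be guided by the worst-case path rather than applied blindly.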
