Tracing Flow Information for Tighter WCET Estimation: Application to Vectorization

Real-time systems have become ubiquitous, and many play an important role in our everyday life. For hard real-time systems, computing correct results is not the only requirement: these results must also be produced within pre-determined deadlines. Designers must compute the worst-case execution times (WCET) of the tasks composing the system, and guarantee that they meet the required timing constraints. Standard static WCET estimation techniques establish a WCET bound from an analysis of the machine code, taking into account additional flow information provided at source code level, either by the programmer or by static code analysis. Precise flow information helps produce tighter WCET bounds, and thus limits over-provisioning of the system. However, it is difficult to keep flow information consistent through the optimizations applied by a compiler, and the majority of real-time systems simply do not apply any optimization. Vectorization is a powerful optimization that exploits the data-level parallelism present in many applications, using the SIMD (single instruction, multiple data) extensions of processor instruction sets. It is a mature optimization, and it is key to the performance of many systems. Unfortunately, it strongly impacts the control flow structure of functions and loops, which makes it more difficult to trace flow information from high-level code down to machine code. For this reason, like many other optimizations, it is overlooked in real-time systems. In this paper, we propose a method to trace and maintain flow information from source code to machine code when vectorization is applied, so that WCET estimation can benefit from this traceability. We implemented our approach in the LLVM compiler. In addition, we show through measurements on single-path programs that vectorization improves not only average-case performance but also WCETs. On TSVC, a benchmark suite designed for vectorizing compilers, the WCET improvement ratio ranges from 1.18x to 1.41x depending on the target architecture.
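
To illustrate why vectorization changes flow information, the sketch below shows how a source-level loop bound might be rewritten when a loop is split into a vector body and a scalar remainder (epilogue) loop. It is only an illustration under stated assumptions, not the method proposed in the paper: the function name `vectorized_loop_bounds` and its parameters are hypothetical, and the loop is assumed to be a simple counted loop whose iteration count is only known to be at most `max_iterations`.

```python
def vectorized_loop_bounds(max_iterations: int, vector_factor: int) -> dict:
    """Derive per-loop iteration bounds after vectorization (hypothetical helper).

    Assumed transformation: a scalar loop running n <= max_iterations times
    becomes a vector loop that handles `vector_factor` elements per iteration,
    followed by a scalar epilogue loop that handles the leftover elements.
    """
    if max_iterations < 0:
        raise ValueError("max_iterations must be non-negative")
    if vector_factor < 1:
        raise ValueError("vector_factor must be at least 1")

    return {
        # The vector loop runs floor(n / VF) times, maximized when n is maximal.
        "vector_loop": max_iterations // vector_factor,
        # The epilogue runs n mod VF times: at most VF - 1, and never more
        # than the original bound itself.
        "epilogue_loop": min(max_iterations, vector_factor - 1),
    }


if __name__ == "__main__":
    # A source-level bound of 100 iterations with 4 elements per vector.
    print(vectorized_loop_bounds(100, 4))
```

A single source-level bound thus becomes two machine-level bounds. A WCET analyzer that only knew the original bound of 100 would have to apply it to both loops, which is safe but pessimistic. That is the kind of precision loss traceability is meant to avoid.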
