Counter-Based Approaches for Efficient WCET Analysis of Multicore Processors with Shared Caches

To enable hard real-time systems to take advantage of multicore processors, it is crucial to obtain the worst-case execution time (WCET) for programs running on multicore processors. However, this is challenging and complicated due to the inter-thread interferences from the shared resources in a multicore processor. Recent research used the combined cache conflict graph (CCCG) to model and compute the worst-case inter-thread interferences on a shared L2 cache in a multicore processor, which is called the CCCG-based approach in this paper. Although it can compute the WCET safely and accurately, its computational complexity is exponential and prohibitive for a large number of cores. In this paper, we propose three counter-based approaches to significantly reduce the complexity of the multicore WCET analysis, while achieving absolute safety with tightness close to the CCCG-based approach. The basic counter-based approach simply counts the worst-case number of cache line blocks mapped to a cache set of a shared L2 cache from all the concurrent threads, and compares it with the associativity of the cache set to compute the worst-case cache behavior. The enhanced counter-based approach uses techniques to enhance the accuracy of calculating the counters. The hybrid counter-based approach combines the enhanced counter-based approach and the CCCG-based approach to further improve the tightness of analysis without significantly increasing the complexity. Our experiments on a 4-core processor indicate that the enhanced counter-based approach overestimates the WCET by 14% on average compared to the CCCG-based approach, while its averaged running time is less than 1/380 that of the CCCG-based approach. The hybrid approach reduces the overestimation to only 2.65%, while its running time is less than 1/150 that of the CCCG-based approach on average.

[1]  Wang Yi,et al.  Combining Abstract Interpretation with Model Checking for Timing Analysis of Multicore Software , 2010, 2010 31st IEEE Real-Time Systems Symposium.

[2]  Wei Zhang,et al.  Static Timing Analysis of Shared Caches for Multicore Processors , 2012, J. Comput. Sci. Eng..

[3]  Yaning Liu,et al.  Anomaly Detection in Medical Wireless Sensor Networks , 2013, J. Comput. Sci. Eng..

[4]  Sharad Malik,et al.  Performance estimation of embedded software with instruction cache modeling , 1995, ICCAD.

[5]  Henrik Theiling,et al.  Combining abstract interpretation and ILP for microarchitecture modelling and program path analysis , 1998, Proceedings 19th IEEE Real-Time Systems Symposium (Cat. No.98CB36279).

[6]  Jan Gustafsson,et al.  The Mälardalen WCET Benchmarks: Past, Present And Future , 2010, WCET.

[7]  Yun Liang,et al.  Timing analysis of concurrent programs running on shared cache multi-cores , 2009, 2009 30th IEEE Real-Time Systems Symposium.

[8]  Wei Zhang,et al.  WCET Analysis for Multi-Core Processors with Shared L2 Instruction Caches , 2008, 2008 IEEE Real-Time and Embedded Technology and Applications Symposium.

[9]  Gerard J. M. Smit,et al.  A mathematical approach towards hardware design , 2010, Dynamically Reconfigurable Architectures.

[10]  Wei Zhang,et al.  Accurately Estimating Worst-Case Execution Time for Multi-core Processors with Shared Direct-Mapped Instruction Caches , 2009, 2009 15th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications.

[11]  David B. Whalley,et al.  Bounding worst-case instruction cache performance , 1994, 1994 Proceedings Real-Time Systems Symposium.

[12]  Francisco J. Cazorla,et al.  Hardware support for WCET analysis of hard real-time multicore systems , 2009, ISCA '09.

[13]  Jan Gustafsson,et al.  Automatic Derivation of Loop Bounds and Infeasible Paths for WCET Analysis Using Abstract Execution , 2006, 2006 27th IEEE International Real-Time Systems Symposium (RTSS'06).

[14]  David B. Whalley,et al.  Bounding Pipeline and Instruction Cache Performance , 1999, IEEE Trans. Computers.

[15]  Wei Zhang,et al.  Bounding Worst-Case Performance for Multi-Core Processors with Shared L2 Instruction Caches , 2011, J. Comput. Sci. Eng..