Probabilistic timing analysis of time-randomised caches with fault detection mechanisms

In the real-time systems domain, time-randomised caches have been proposed as a way to simplify software timing analysis, i.e. the process of estimating the probabilistic worst-case execution time (pWCET) of an application. However, technology scaling in the cache memory manufacturing process is making transient and permanent faults increasingly likely. These faults, in turn, affect a system's timing behaviour and the complexity of its analysis. In this study, the authors propose a static probabilistic timing analysis approach for time-randomised caches that accounts for the presence of faults, and of their detection mechanisms, using a state-space modelling technique. Their experiments show that the proposed methodology is capable of providing tight pWCET estimates. The analysis compares how two online mechanisms for the detection and classification of faults, i.e. a rule-based system and dynamic hidden Markov models (D-HMMs), affect the estimation of safe pWCET bounds. The experimental results show that the choice of mechanism can greatly affect safe pWCET margins and that using D-HMMs improves the pWCET of the system with respect to rule-based detection.
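
As a rough illustration of what a pWCET estimate expresses, the following minimal sketch builds an execution-time distribution by convolving per-access latency distributions and reads off the smallest time bound whose exceedance probability stays below a target. It is not the authors' state-space method. The hit probabilities, the latencies, and the function names (`access_distribution`, `convolve`, `pwcet`, `program_distribution`) are hypothetical, and the accesses are assumed to be independent, which a real static probabilistic timing analysis cannot take for granted.

```python
from collections import defaultdict


def access_distribution(p_hit, hit_latency=1, miss_latency=10):
    """Latency distribution of one access as {cycles: probability}.
    The latencies are illustrative values, not taken from the paper."""
    return {hit_latency: p_hit, miss_latency: 1.0 - p_hit}


def convolve(a, b):
    """Distribution of the sum of two independent latency distributions."""
    out = defaultdict(float)
    for ta, pa in a.items():
        for tb, pb in b.items():
            out[ta + tb] += pa * pb
    return dict(out)


def program_distribution(hit_probs):
    """Execution-time distribution of a sequence of accesses,
    assuming (for this sketch only) that the accesses are independent."""
    total = {0: 1.0}
    for p in hit_probs:
        total = convolve(total, access_distribution(p))
    return total


def pwcet(dist, exceedance):
    """Smallest time t such that P(execution time > t) <= exceedance."""
    tail = 1.0
    for t in sorted(dist):
        tail -= dist[t]
        if tail <= exceedance + 1e-12:  # tolerance for rounding error
            return t
    return max(dist)


# Hypothetical hit probabilities: a fault-free cache versus one in which
# faults have reduced the usable capacity and hence the hit probability.
fault_free = program_distribution([0.9, 0.9, 0.9])
degraded = program_distribution([0.5, 0.5, 0.5])

print(pwcet(fault_free, 0.01))  # bound exceeded with probability <= 0.01
print(pwcet(degraded, 0.01))
```

Under these assumed numbers the degraded cache yields a larger bound at the same exceedance probability, which is the kind of effect on pWCET margins that an analysis accounting for faults has to capture.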