Supervised learning based model for predicting variability-induced timing errors

Circuit designers typically combat variations in hardware and workload by adding conservative guardbands, which leads to operational inefficiency. Reducing this excessive guardband is highly desirable, but it causes timing errors in synchronous circuits. We propose a methodology for building supervised-learning-based models that predict timing errors at the bit level. We show that a logistic-regression-based model can effectively predict timing errors for a given amount of guardband reduction. The proposed methodology enables a model-based rule method that reduces the guardband subject to a required bit-level reliability specification. To predict timing errors at the bit level, the proposed model generation automatically uses one binary classifier per output bit, which captures the circuit path sensitization. We train and test our model on gate-level simulations with timing error information extracted from an ASIC flow that considers the physical details of placed-and-routed single-precision pipelined floating-point units (FPUs) in 45nm TSMC technology. We further assess the robustness of our modeling methodology across various operating voltage and temperature corners. Our model predicts timing errors with an average accuracy of 95% for unseen input workloads. This accuracy can be used to achieve a 0% to 15% guardband reduction for FPUs while satisfying the reliability specification for four error-tolerant applications.
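As an illustration of the "one binary classifier per output bit" idea, the following is a minimal sketch rather than the authors' implementation. It trains an independent logistic regression for each output bit using plain batch gradient descent. The six binary input features, the two output bits, and the rule that generates the synthetic timing-error labels are all hypothetical assumptions made only for this example; they do not come from the gate-level simulations described above.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_bit_classifier(X, y, lr=0.5, epochs=400):
    """Fit one logistic regression (weights, bias) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_bit(model, x):
    # Predict 1 (timing error) when the estimated probability is at least 0.5.
    w, b = model
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


random.seed(0)
N_FEATURES = 6  # hypothetical number of binary input features
N_OUTPUT_BITS = 2  # hypothetical number of output bits


def make_sample():
    x = [random.randint(0, 1) for _ in range(N_FEATURES)]
    # Hypothetical ground truth, standing in for simulated timing errors:
    # output bit 0 errs when features 0 and 1 are both set,
    # output bit 1 errs when feature 2 is set.
    y = [x[0] & x[1], x[2]]
    return x, y


train = [make_sample() for _ in range(300)]
test = [make_sample() for _ in range(100)]  # held-out ("unseen") inputs

X_train = [x for x, _ in train]
# One independent binary classifier per output bit.
models = [
    train_bit_classifier(X_train, [y[k] for _, y in train])
    for k in range(N_OUTPUT_BITS)
]

# Per-bit prediction accuracy on the held-out inputs.
accuracies = [
    sum(predict_bit(models[k], x) == y[k] for x, y in test) / len(test)
    for k in range(N_OUTPUT_BITS)
]
print(accuracies)
```

In practice the labels would come from timing-annotated simulation at a chosen guardband reduction, and a library implementation of logistic regression would likely replace the hand-written training loop; the per-bit structure is the point of the sketch.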
