Untestable faults identification in GPGPUs for safety-critical applications

General-Purpose Graphics Processing Units (GPGPUs) are considered promising solutions for high-performance safety-critical applications, such as those in the automotive field. However, their adoption requires techniques that effectively detect faults arising in the device during its operational life, so effective in-field test solutions are needed to guarantee high reliability levels. In this paper, we build on the results of Software-Based Self-Test (SBST) approaches for GPGPUs by deploying new techniques that automate the identification of untestable faults (UFs). Applied to an open-source implementation of the NVIDIA G80 GPU architecture, our methodology achieves a fault coverage of 82.8%. The proposed approach, which combines SBST with UF identification, appears to be an effective solution for the reliability analysis of GPGPUs.
