Evaluation of Histogram of Oriented Gradients Soft Errors Criticality for Automotive Applications

Reliable pedestrian detection is a key requirement for autonomous and assisted driving, and methods based on the Histogram of Oriented Gradients (HOG) are very popular. Embedded Graphics Processing Units (GPUs) are used to run HOG efficiently. Unfortunately, GPU architectures have been shown to be particularly vulnerable to radiation-induced failures. This article presents an experimental evaluation and analytical study of HOG reliability. We aim to quantify and qualify the radiation-induced errors in pedestrian detection applications executed on embedded GPUs. We analyze experimental results obtained by executing HOG on embedded GPUs from two different vendors, exposed for about 100 hours to a controlled neutron beam at Los Alamos National Laboratory. To discriminate critical erroneous computations, we consider the number and position of detected objects as well as precision and recall. The analysis shows that, while HOG is intrinsically resilient (65% to 85% of output errors only slightly impact detection), it experienced some particularly critical errors that could result in undetected pedestrians or unnecessary vehicle stops. Additionally, we perform a fault-injection campaign to identify HOG's critical procedures. We observe that Resize and Normalize are the most sensitive and critical phases, as about 20% of injections into them generate an output error that significantly impacts HOG detection. These insights let us identify the limited portions of HOG that, if hardened, are most likely to increase reliability without introducing unnecessary overhead.
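To make the precision and recall criterion concrete, the following is a minimal Python sketch of how a corrupted detection output could be scored against the fault-free ("golden") output of the same frame. It is an illustration under stated assumptions, not the procedure used in the article: the box format `(x, y, width, height)`, the intersection-over-union matching rule, the 0.5 threshold, the greedy matching order, and the names `iou` and `precision_recall` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = overlap_w * overlap_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    golden: List[Box],
    faulty: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the article
) -> Tuple[float, float]:
    """Score a corrupted output against the golden output of the same frame.

    Each faulty box is greedily matched to the best-overlapping golden box
    that has not been matched yet. An empty list scores 1.0 by convention.
    """
    unmatched = list(golden)
    true_positives = 0
    for box in faulty:
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(faulty) if faulty else 1.0
    recall = true_positives / len(golden) if golden else 1.0
    return precision, recall


# Made-up example: one box shifted by two pixels, one box displaced far away.
golden = [(10, 10, 50, 100), (200, 40, 40, 90)]
faulty = [(12, 10, 50, 100), (400, 40, 40, 90)]
print(precision_recall(golden, faulty))  # (0.5, 0.5)
```

In this reading, a drop in recall corresponds to golden detections that are missing from the corrupted output (potentially undetected pedestrians), while a drop in precision corresponds to spurious detections (potentially unnecessary vehicle stops). A slightly shifted box still matches its golden counterpart, so it leaves both metrics unchanged.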
