Per-frame mAP Prediction for Continuous Performance Monitoring of Object Detection During Deployment

Performance monitoring of object detection is crucial for safety-critical applications such as autonomous vehicles that operate under varying and complex environmental conditions. Currently, object detectors are evaluated using summary metrics computed on a single dataset that is assumed to be representative of all future deployment conditions. In practice, this assumption does not hold, and performance fluctuates as a function of the deployment conditions. To address this issue, we propose an introspection approach to performance monitoring during deployment that does not require ground-truth data. We do so by using the detector’s internal features to predict when the per-frame mean average precision (mAP) drops below a critical threshold. We quantitatively evaluate our method and demonstrate its ability to reduce risk by trading off the cost of an incorrect decision against the cost of raising the alarm and abstaining from detection.
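The idea of predicting a drop in per-frame mAP from internal features can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the function names, the mean/max/std pooling of a (C, H, W) feature map, and the plain logistic-regression classifier that stands in for whatever predictor the method actually uses. Labels are built offline, where ground truth is available, by thresholding each frame's mAP. At deployment only the features are needed.

```python
import numpy as np

def pool_features(feature_map):
    """Collapse a (C, H, W) feature map into a 3C vector (assumed pooling:
    per-channel mean, max and standard deviation over spatial positions)."""
    flat = feature_map.reshape(feature_map.shape[0], -1)
    return np.concatenate([flat.mean(axis=1), flat.max(axis=1), flat.std(axis=1)])

def failure_labels(per_frame_map, threshold):
    """Label a frame 1 when its mAP is below the critical threshold, else 0."""
    return (np.asarray(per_frame_map) < threshold).astype(int)

def _sigmoid(z):
    # Clip to keep exp() numerically safe for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def train_logistic(X, y, lr=0.1, steps=500):
    """Stand-in classifier: logistic regression fitted by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = _sigmoid(X @ w + b) - y      # gradient of the log-loss w.r.t. the score
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def raise_alarm(x, w, b, alarm_prob=0.5):
    """True when the predicted probability of a low-mAP frame reaches alarm_prob."""
    return bool(_sigmoid(x @ w + b) >= alarm_prob)
```

In this sketch, lowering the hypothetical `alarm_prob` makes the monitor abstain more often but miss fewer low-mAP frames, which is the kind of trade-off the abstract describes.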
