Relaxed Softmax: Efficient Confidence Auto-Calibration for Safe Pedestrian Detection

As machine learning moves from the lab into the real world, reliability is often of paramount importance. The clearest examples are safety-critical applications such as pedestrian detection in autonomous driving. Since algorithms can never be expected to be perfect in all cases, managing reliability becomes crucial. To this end, in this paper we investigate the problem of learning, in an end-to-end manner, object detectors that are accurate while also providing an unbiased estimate of the reliability of their own predictions. We do so by proposing a modification of the standard softmax layer in which a probabilistic confidence score is explicitly pre-multiplied into the incoming activations to modulate confidence. We adopt a rigorous assessment protocol based on reliability diagrams to evaluate the quality of the resulting calibration and show excellent results in pedestrian detection on two challenging public benchmarks.
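
The two ideas in the abstract, scaling the activations by a confidence score before the softmax and checking calibration with a reliability diagram, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact parameterization, so the function names (relaxed_softmax, reliability_bins), the assumption that the confidence is a scalar in (0, 1] passed in by hand (in a detector it would presumably be predicted by the network), and the equal-width binning are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_softmax(logits, confidence):
    """Softmax with the logits pre-multiplied by a confidence score.

    Assumption: `confidence` is a scalar in (0, 1]. A value of 1 recovers
    the standard softmax; smaller values flatten the output distribution.
    """
    return softmax([confidence * z for z in logits])


def reliability_bins(confidences, correct, n_bins=10):
    """Group predictions into equal-width confidence bins.

    Returns one (mean confidence, empirical accuracy, count) tuple per
    non-empty bin, ordered from low to high confidence. A well-calibrated
    model has mean confidence close to accuracy in every bin.
    """
    stats = [[0.0, 0, 0] for _ in range(n_bins)]
    for p, c in zip(confidences, correct):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        stats[idx][0] += p
        stats[idx][1] += int(c)
        stats[idx][2] += 1
    return [(s / n, k / n, n) for s, k, n in stats if n > 0]


if __name__ == "__main__":
    logits = [4.0, 0.0]  # hypothetical pedestrian / background scores
    print(softmax(logits))
    print(relaxed_softmax(logits, 0.25))
    print(reliability_bins([0.95, 0.92, 0.15], [1, 0, 0]))
```

In this sketch a low confidence pulls the predicted probabilities toward uniform without changing which class ranks highest, which is one way a detector could express doubt about its own output.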
