Paper information - Introspective perception: Learning to predict failures in vision systems

As robots aspire to long-term autonomous operation in complex, dynamic environments, the ability to reliably make mission-critical decisions in ambiguous situations becomes essential. This motivates the need to build systems with the situational awareness to assess how qualified they are, at a given moment, to make a decision. We call this self-evaluating capability introspection. In this paper, we take a small step in this direction and propose a generic framework for introspective behavior in perception systems. Our goal is to learn a model that reliably predicts failures of a given system, with respect to a task, directly from input sensor data. We present this framework in the context of vision-based autonomous MAV flight in outdoor natural environments and show that it effectively handles uncertain situations.

[1] R. C. Coulter,et al. Implementation of the Pure Pursuit Path Tracking Algorithm , 1992 .

[2] Robert P. W. Duin,et al. Classifier Conditional Posterior Probabilities , 1998, SSPR/SPR.

[3] Padraig Cunningham,et al. Generating Estimates of Classification Confidence for a Case-Based Spam Filter , 2005, ICCBR.

[4] Alonzo Kelly,et al. Toward Reliable Off Road Autonomous Vehicles Operating in Challenging Environments , 2006, Int. J. Robotics Res..

[5] Luc Van Gool,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[6] Alonzo Kelly,et al. Optimal Sampling In the Space of Paths: Preliminary Results , 2006 .

[7] Bruno Mirbach,et al. Confidence Estimation in Classification Decision: A Method for Detecting Unseen Patterns , 2006 .

[8] Yann LeCun,et al. Large-scale Learning with SVM and Convolutional Nets for Generic Object Categorization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9] William Whittaker,et al. Robotic introspection for exploration and mapping of subterranean environments , 2007 .

[10] Thomas J. Walsh,et al. Knows what it knows: a framework for self-aware learning , 2008, ICML '08.

[11] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[12] Pietro Perona,et al. Entropy-based active learning for object recognition , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[13] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[14] Trevor Darrell,et al. Gaussian Processes for Object Categorization , 2010, International Journal of Computer Vision.

[15] Horst Bischof,et al. Motion estimation with non-local total variation regularization , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[16] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.

[17] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.

[18] Yaser Al-Onaizan,et al. Goodness: A Method for Measuring Machine Translation Confidence , 2011, ACL.

[19] Shang-Hua Teng,et al. Power SVM: Generalization with exemplar classification uncertainty , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[20] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[21] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.

[22] Andreas Geiger,et al. Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23] C. V. Jawahar,et al. Has My Algorithm Succeeded? An Evaluator for Human Pose Estimators , 2012, ECCV.

[24] Martial Hebert,et al. Learning monocular reactive UAV control in cluttered natural environments , 2012, 2013 IEEE International Conference on Robotics and Automation.

[25] Yichuan Tang,et al. Deep Learning using Linear Support Vector Machines , 2013, 1306.0239.

[26] Youmin Zhang,et al. Development of advanced FDD and FTC techniques with application to an unmanned quadrotor helicopter testbed , 2013, J. Frankl. Inst..

[27] Rudolph Triebel,et al. Knowing when we don't know: Introspective classification for mission-critical decision making , 2013, 2013 IEEE International Conference on Robotics and Automation.

[28] Dong Liu,et al. Discovering joint audio–visual codewords for video event detection , 2013, Machine Vision and Applications.

[29] Stefan Carlsson,et al. Properties of Datasets Predict the Performance of Classifiers , 2013, BMVC.

[30] Horst Bischof,et al. Flexible and User-Centric Camera Calibration using Planar Fiducial Markers , 2013, BMVC.

[31] Pietro Perona,et al. A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[32] Daniel Cremers,et al. Semi-dense Visual Odometry for a Monocular Camera , 2013, 2013 IEEE International Conference on Computer Vision.

[33] Rudolph Triebel,et al. Driven Learning for Driving: How Introspection Improves Semantic Mapping , 2016, ISRR.

[34] Sinan Kalkan,et al. Deep Hierarchies in the Primate Visual Cortex: What Can We Learn for Computer Vision? , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35] Rob Fergus,et al. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.

[36] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[37] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[38] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[39] Ali Farhadi,et al. Towards Transparent Systems: Semantic Characterization of Failure Modes , 2014, ECCV.

[40] Ali Farhadi,et al. Predicting Failures of Vision Systems , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[41] Daniel Cremers,et al. LSD-SLAM: Large-Scale Direct Monocular SLAM , 2014, ECCV.

[42] Fingerprint Image Quality , 2015, Encyclopedia of Biometrics.

[43] Eric Horvitz,et al. Metareasoning for Planning Under Uncertainty , 2015, IJCAI.

[44] Martial Hebert,et al. Vision and Learning for Deliberative Monocular Cluttered Flight , 2014, FSR.

[45] Martial Hebert,et al. Semi-Dense Visual Odometry for Monocular Navigation in Cluttered Environment , 2015 .

[46] Rudolph Triebel,et al. Introspective classification for robot perception , 2016, Int. J. Robotics Res..

[47] Martial Hebert,et al. Robust Monocular Flight in Cluttered Outdoor Environments , 2016, ArXiv.