Pose Estimation Errors, the Ultimate Diagnosis

This paper presents a thorough diagnosis of the problem of object detection and pose estimation. We provide a diagnostic tool to examine the impact of the different types of false positives on performance, as well as the effects of the main object characteristics. We focus our study on the PASCAL 3D+ dataset, developing a complete diagnosis of four state-of-the-art approaches, which range from hand-crafted models to deep learning solutions. We show that a clear understanding of typical failure cases, and of the effects of object characteristics on model performance, is fundamental to further progress towards more accurate solutions for this challenging task.
