Viewpoint detection models for sequential embodied object category recognition

This paper proposes a method for learning viewpoint detection models for object categories; these models facilitate sequential object category recognition and viewpoint planning. We examine such models for several state-of-the-art object detection methods. The learning procedure is evaluated on an exhaustive multiview category database recently collected for multiview category recognition research, and our approach is evaluated in a simulator built from previously collected real images. Simulation results verify that our viewpoint planning approach requires fewer viewpoints for confident recognition. Finally, we illustrate the applicability of our method as a component of a completely autonomous visual recognition platform that has previously been demonstrated in an object category recognition competition.
