Active multi-view object recognition: A unifying view on online feature selection and view planning

Many robots are limited in their operating capabilities, both computationally and in terms of energy, so there is a strong incentive to keep computation cost and energy consumption to a minimum when executing tasks such as object recognition on a mobile robot. Adaptive action selection is a paradigm that offers great flexibility in trading off the cost of acquiring information against making robust and reliable inference under uncertainty. In this paper, we study active multi-view object recognition and describe an information-theoretic framework that combines and unifies two common techniques: online feature selection for reducing computational cost and view planning for resolving ambiguities and occlusions. Our algorithm adaptively chooses between two strategies: evaluating only the features that are most informative for recognition, or moving to a new viewpoint that optimally reduces the expected uncertainty about the identity of the object. This two-step process keeps the overall computation cost minimal while simultaneously increasing recognition accuracy. Extensive empirical studies on a large RGB-D dataset, with two different feature sets, validate the effectiveness of the proposed framework. Our experiments show that dynamic feature selection alone reduces runtime computation by a factor of 2.5 to 6, and that combining it with viewpoint selection increases recognition accuracy by 8%-18% absolute on average compared with systems that use neither strategy. By establishing a link between active object recognition and change detection, we are further able to apply the framework to the follow-up task of actively detecting object change. Finally, we demonstrate the framework's applicability to a low-powered quadcopter platform with limited operating time.

Highlights:
- A unified approach to feature and viewpoint selection for multi-view object recognition is proposed.
- Online feature selection reduces the dimensionality and, with that, the computation time.
- View planning offers performance advantages whenever multiple views are required due to ambiguous situations or occlusions.
- Increased recognition accuracy and reduced computation cost are realized by an information-theoretic action selection framework.
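To make the action-selection idea concrete, the following is a minimal, hypothetical Python sketch of greedy information-theoretic action selection. It assumes a discrete belief over object classes, discretized observation models for each candidate feature and viewpoint, and simple scalar action costs; the function names (`expected_info_gain`, `select_action`), the gain-per-cost criterion, and the cost values are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (in nats)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_info_gain(belief, likelihoods):
    """Expected reduction in class entropy for one candidate action.

    belief:      shape (C,), current posterior p(c) over object classes
    likelihoods: shape (M, C), p(o_m | c) for each discretized outcome o_m
    """
    h_prior = entropy(belief)
    # Predictive distribution over outcomes: p(o_m) = sum_c p(o_m | c) p(c)
    p_outcome = likelihoods @ belief
    h_post = 0.0
    for m, p_m in enumerate(p_outcome):
        if p_m <= 0:
            continue
        posterior = likelihoods[m] * belief       # Bayes update for outcome o_m
        posterior /= posterior.sum()
        h_post += p_m * entropy(posterior)        # expected posterior entropy
    return h_prior - h_post

def select_action(belief, feature_models, view_models,
                  feature_cost=1.0, view_cost=5.0):
    """Pick the feature evaluation or viewpoint change with the best
    expected information gain per unit cost (costs are illustrative)."""
    candidates = []
    for name, lik in feature_models.items():
        score = expected_info_gain(belief, lik) / feature_cost
        candidates.append((score, ('feature', name)))
    for name, lik in view_models.items():
        score = expected_info_gain(belief, lik) / view_cost
        candidates.append((score, ('move', name)))
    return max(candidates, key=lambda c: c[0])[1]
```

Under this kind of one-step greedy rule, cheap feature evaluations are preferred as long as they remain informative for the current belief; once the remaining features no longer resolve the ambiguity, the cost-normalized gain of a viewpoint change dominates and the robot moves, which mirrors the adaptive two-step behavior described in the abstract.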
