Simultaneous place and object recognition with mobile robot using pose encoded contextual information

Place recognition and object recognition are two fundamental problems a mobile robot must solve to understand its surroundings. In computer vision it is well established that context plays an important role in image parsing, but in most studies contextual information is used in only one direction, and little attention is paid to the relative pose context between objects and local features. We observe, however, that place and object can serve as context for each other: recognizing one facilitates recognizing the other. In this paper, we propose a new hierarchical random field for simultaneous place and object recognition on a mobile platform; it encodes multiple kinds of context, including co-occurrence context, temporal context, and relative pose context. We also define a new kind of relative pose context that is scale and rotation invariant, which improves the stability of pose-encoded context. Experimental results with a mobile robot show that the proposed method significantly improves the precision of place and object recognition in both familiar and unfamiliar environments.
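
The abstract does not give the definition of the scale- and rotation-invariant relative pose context, so the following is only a minimal sketch of one common way such a descriptor can be built, not the paper's formulation. It assumes each local feature is a keypoint `(x, y, scale, angle)` in the style of SIFT-like detectors, and expresses a second keypoint in the reference keypoint's own frame. The names `relative_pose`, `wrap_angle`, and `similarity_transform` are hypothetical helpers introduced for illustration.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def relative_pose(ref, other):
    """Pose of `other` expressed in the frame of `ref`.

    Both arguments are (x, y, scale, angle_rad) keypoints (an assumed
    representation). Returns a 4-tuple:
      distance / ref scale, bearing relative to ref orientation,
      scale ratio, relative orientation.
    Each component is unchanged by a global similarity transform
    (translation, rotation, uniform scaling) of the image.
    """
    x0, y0, s0, a0 = ref
    x1, y1, s1, a1 = other
    dx, dy = x1 - x0, y1 - y0
    dist = math.hypot(dx, dy) / s0                 # scale-normalized distance
    bearing = wrap_angle(math.atan2(dy, dx) - a0)  # direction in ref's frame
    scale_ratio = s1 / s0
    rel_angle = wrap_angle(a1 - a0)
    return dist, bearing, scale_ratio, rel_angle


def similarity_transform(kp, scale, theta, tx, ty):
    """Apply a global similarity transform to a keypoint (for checking)."""
    x, y, s, a = kp
    c, sn = math.cos(theta), math.sin(theta)
    return (scale * (c * x - sn * y) + tx,
            scale * (sn * x + c * y) + ty,
            s * scale,
            a + theta)


if __name__ == "__main__":
    ref = (10.0, 20.0, 2.0, 0.3)
    other = (40.0, 60.0, 3.0, 1.1)
    print(relative_pose(ref, other))
    moved = [similarity_transform(k, 2.5, 0.7, 5.0, -3.0) for k in (ref, other)]
    print(relative_pose(*moved))  # should match the line above up to rounding
```

Normalizing by the reference keypoint's scale and orientation is what makes the descriptor insensitive to the camera moving closer or rolling, which is the kind of stability the abstract attributes to its pose-encoded context.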
