Simulation and weights of multiple cues for robust object recognition

Reliable object recognition is an important capability for agents that accomplish or assist in a variety of useful tasks, such as search and rescue or office assistance. Numerous approaches attempt to recognize objects based on visual cues alone. However, objects of the same type can differ greatly in visual appearance, including shape, size, pose, and color. Although such approaches are widely studied and have had relative success, the general task of object recognition remains difficult. In previous work, we introduced MCOR (multiple-cue object recognition), a flexible object recognition approach that can use multiple cues of any kind, whether they are visual cues intrinsic to the object or cues provided by observing a human. As part of the framework, weights were provided to reflect the variation in the strength of the association between a particular cue and an object. In this paper, we demonstrate how the probabilistic relational framework used to determine the weights can be applied in complex scenarios with numerous objects, cues, and relationships among them. We develop a simulator that can generate these complex scenarios using cues based on real recognition systems.
