Active exploration and keypoint clustering for object recognition

Object recognition is a challenging problem for artificial systems, especially for objects placed in cluttered and uncontrolled environments. To address this problem, we discuss an active approach to object recognition. Instead of passively observing objects, we use a robot to actively explore them. This enables the system to learn objects from different viewpoints and to actively select viewpoints for optimal recognition. Active vision furthermore simplifies the segmentation of the object from its background. As the basis for object recognition we use the Scale Invariant Feature Transform (SIFT). SIFT has been a successful method for image representation. However, a known drawback of SIFT is that the computational complexity of the algorithm increases with the number of keypoints. We discuss a growing-when-required (GWR) network for efficient clustering of the keypoints. The results show successful learning of 3D objects in real-world environments. The active approach is successful in separating the object from its cluttered background, and the active selection of viewpoints further increases the performance. Moreover, the GWR network strongly reduces the number of keypoints.
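To make the keypoint-clustering idea concrete, the following is a minimal sketch of a growing-when-required style network in plain Python. It is not the implementation used in this work. The function name `gwr_cluster`, all parameter names and default values, and the multiplicative decay of the firing counters are assumptions chosen for illustration, and the toy input is 2D rather than 128-dimensional SIFT descriptors. A node is added only when the best-matching node is both a poor match (low activity) and already well trained (low firing counter); otherwise the existing nodes are adapted.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gwr_cluster(points, activity_threshold=0.8, firing_threshold=0.3,
                eps_best=0.2, eps_neighbor=0.01, decay_best=0.7,
                decay_neighbor=0.9, max_age=20, epochs=3, seed=0):
    """Return node vectors summarising `points` (all values are assumed defaults)."""
    rng = random.Random(seed)
    first, second = rng.sample(range(len(points)), 2)
    weights = {0: list(points[first]), 1: list(points[second])}
    firing = {0: 1.0, 1: 1.0}   # 1.0 = untrained, decays each time a node fires
    edges = {}                  # frozenset({i, j}) -> age
    next_id = 2
    for _ in range(epochs):
        order = list(range(len(points)))
        rng.shuffle(order)
        for idx in order:
            x = points[idx]
            ranked = sorted(weights, key=lambda n: dist(weights[n], x))
            best, runner_up = ranked[0], ranked[1]
            edges[frozenset((best, runner_up))] = 0  # create or refresh edge
            neighbors = [n for e in edges if best in e for n in e if n != best]
            activity = math.exp(-dist(weights[best], x))
            if activity < activity_threshold and firing[best] < firing_threshold:
                # Grow: insert a node halfway between the winner and the input.
                new = next_id
                next_id += 1
                weights[new] = [(w + v) / 2 for w, v in zip(weights[best], x)]
                firing[new] = 1.0
                del edges[frozenset((best, runner_up))]
                edges[frozenset((best, new))] = 0
                edges[frozenset((runner_up, new))] = 0
            else:
                # Adapt: move the winner and its neighbours towards the input,
                # scaled by how untrained each node still is.
                weights[best] = [w + eps_best * firing[best] * (v - w)
                                 for w, v in zip(weights[best], x)]
                for n in neighbors:
                    weights[n] = [w + eps_neighbor * firing[n] * (v - w)
                                  for w, v in zip(weights[n], x)]
            # Habituate the winner and its neighbours (simplified decay).
            firing[best] *= decay_best
            for n in neighbors:
                firing[n] *= decay_neighbor
            # Age the winner's edges and drop those that are too old.
            for e in list(edges):
                if best in e:
                    edges[e] += 1
                    if edges[e] > max_age:
                        del edges[e]
            # Remove nodes left without edges, keeping at least two nodes.
            connected = {n for e in edges for n in e}
            for n in list(weights):
                if n not in connected and len(weights) > 2:
                    del weights[n]
                    del firing[n]
    return list(weights.values())


if __name__ == "__main__":
    demo_rng = random.Random(1)
    centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
    data = [(cx + demo_rng.gauss(0, 0.05), cy + demo_rng.gauss(0, 0.05))
            for cx, cy in centers for _ in range(200)]
    nodes = gwr_cluster(data)
    print(len(data), "points summarised by", len(nodes), "nodes")
```

In a recognition pipeline, the returned node vectors would stand in for the raw keypoint descriptors, so that matching compares a query against far fewer vectors. The activity threshold controls the trade-off: a higher value yields more nodes and a finer representation.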
