Learning to Recognise Objects and Situations to Control a Robot End-Effector

View-based representations have become very popular for recognition tasks. In this contribution, we argue that the potential of the approach is not yet fully tapped: tasks need not be “homogeneous”, i.e. there is no need to restrict a system to, for example, either “object classification” or “gesture recognition”. Instead, qualitatively different problems such as gesture recognition and scene evaluation can be handled simultaneously by the same system. This feature makes the view-based approach a well-suited tool for robotics, as will be demonstrated for the domain of an end-effector camera. In the described scenario, the task is threefold: recognition of object types, judging the stability of grasps on objects, and classification of hand gestures. Because this task leads to a large variety of views, we describe a neural network-based recognition architecture specifically designed to represent highly non-linear distributions of view samples.
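
The paper itself supplies the architectural details; purely as a rough illustration, the sketch below shows one common way to realise a view-based recogniser for strongly non-linear view distributions: vector quantisation of the raw views, a local PCA inside each resulting Voronoi cell, and a nearest-class-mean decision in the local feature space. Plain k-means and a batch SVD stand in here for the neural vector quantiser and the online PCA networks typically used in such architectures; the class name `LocalPCAViewClassifier` and all parameters are hypothetical, not taken from the paper.

```python
import numpy as np


class LocalPCAViewClassifier:
    """Hypothetical sketch: vector quantisation + local PCA + nearest class mean."""

    def __init__(self, n_prototypes=8, n_components=10, n_iter=20, seed=0):
        self.n_prototypes = n_prototypes   # number of Voronoi cells in view space
        self.n_components = n_components   # local PCA dimensions per cell
        self.n_iter = n_iter               # k-means iterations
        self.rng = np.random.default_rng(seed)

    def _assign(self, X):
        # Index of the nearest prototype for every row of X.
        return np.argmin(((X[:, None, :] - self.prototypes_[None]) ** 2).sum(-1), axis=1)

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Vector quantisation: plain k-means stands in for a neural quantiser.
        self.prototypes_ = X[self.rng.choice(len(X), self.n_prototypes, replace=False)].copy()
        for _ in range(self.n_iter):
            assign = self._assign(X)
            for k in range(len(self.prototypes_)):
                if np.any(assign == k):
                    self.prototypes_[k] = X[assign == k].mean(axis=0)
        # Drop prototypes that attract no samples, then reassign to the survivors.
        keep = np.bincount(self._assign(X), minlength=len(self.prototypes_)) > 0
        self.prototypes_ = self.prototypes_[keep]
        assign = self._assign(X)
        # Local PCA basis and per-class means inside every Voronoi cell.
        self.bases_, self.class_means_ = [], []
        for k in range(len(self.prototypes_)):
            Xk, yk = X[assign == k], y[assign == k]
            centred = Xk - self.prototypes_[k]
            _, _, vt = np.linalg.svd(centred, full_matrices=False)   # batch PCA
            basis = vt[: self.n_components]
            Z = centred @ basis.T                                    # local feature vectors
            self.bases_.append(basis)
            self.class_means_.append({c: Z[yk == c].mean(axis=0) for c in np.unique(yk)})
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        labels = []
        for x in X:
            k = int(self._assign(x[None])[0])                 # winning cell
            z = (x - self.prototypes_[k]) @ self.bases_[k].T  # project into local PCA space
            means = self.class_means_[k]
            labels.append(min(means, key=lambda c: np.linalg.norm(z - means[c])))
        return np.array(labels)
```

Training inputs would be flattened image windows, and the label vector may mix object types, grasp-stability judgements, and gesture classes; in line with the abstract's argument, nothing in such a pipeline ties it to a single “homogeneous” task.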
