Combining Multiple Views and Temporal Associations for 3-D Object Recognition

This article describes an architecture for recognizing three-dimensional objects on the basis of viewer-centred representations and temporal associations. Drawing on evidence from psychophysics, neurophysiology, and computer science, we adopt a viewer-centred approach to the representation of three-dimensional objects. Although this concept quite naturally suggests using the temporal order of the views for learning and recognition, this aspect is often neglected. We therefore pay special attention to the evaluation of the temporal information and embed it into a conceptual framework of biological findings and computational advantages. The proposed recognition system consists of four stages and includes different kinds of artificial neural networks. Preprocessing is done by a Gabor-based wavelet transform. A Dynamic Link Matching algorithm, extended by several modifications, forms the second stage and implements recognition and learning of the view classes. The temporal order of the views is recorded by a STORE network, which transforms the output for a presented sequence of views into an item-and-order coding. A subsequent Gaussian ARTMAP architecture classifies the sequences and maps them onto object classes by means of supervised learning. The results achieved with this system show that it can learn autonomously and recognize objects that are considerably similar to one another. Furthermore, the given examples illustrate the benefits that object recognition gains from using the temporal context: ambiguous views become manageable, and a higher degree of robustness against misclassification can be achieved.
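To illustrate what an item-and-order coding of a view sequence can look like, the following minimal Python sketch may help. It is not the STORE network itself: the STORE dynamics are replaced here by a simple geometric decay, which is an assumption made purely for illustration, and the names `item_and_order_code`, `num_view_classes`, and `decay` are hypothetical. The sketch only shows the general idea that a single activity vector can record both which view classes occurred and in what order.

```python
def item_and_order_code(view_sequence, num_view_classes, decay=0.5):
    """Encode a sequence of view-class indices as one activity vector.

    Simplified stand-in for a working-memory stage (an assumption, not the
    STORE equations): every time a view is presented, all stored activities
    are scaled by `decay` and the current view class is set to 1.0. The
    result is a recency gradient: later views have larger activities.
    """
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    code = [0.0] * num_view_classes
    for view in view_sequence:
        if not 0 <= view < num_view_classes:
            raise ValueError("view-class index out of range")
        # Older items fade, so their relative activity encodes their position.
        code = [activity * decay for activity in code]
        # The most recent view class receives the highest activity.
        code[view] = 1.0
    return code


if __name__ == "__main__":
    # Same three view classes, presented in two different orders.
    print(item_and_order_code([0, 2, 1], num_view_classes=4))
    print(item_and_order_code([1, 2, 0], num_view_classes=4))
```

The two calls contain the same view classes but yield different vectors, so a subsequent classifier can in principle separate sequences that differ only in temporal order. In this simplified scheme a repeated view class overwrites its earlier entry, which is a limitation of the sketch.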
