Object Recognition with a Sparse and Autonomously Learned Representation Based on Banana Wavelets

We introduce an object recognition system that is based on the well-known Elastic Graph Matching (EGM) but includes significant improvements over earlier versions. Our basic features are banana wavelets, which are generalized Gabor wavelets. In addition to frequency and orientation, banana wavelets have the attributes curvature and size. Banana wavelets can be metrically organized. A sparse and efficient representation of object classes is learned using this metric organization. Learning is guided by a sensible amount of a priori knowledge in the form of basic principles. The learned representation is used for fast matching. A significant speed-up can be achieved by hierarchical processing of features. Furthermore, manual construction of ground truth is replaced by automatic generation of suitable training examples using motor-controlled feedback. We motivate the biological plausibility of our approach by utilizing concepts inspired by brain research, such as hierarchical processing and metrical organization of features, and we criticize overly detailed modelling of biological processing.
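
To make the four wavelet attributes concrete, the following is a minimal sketch of how a Gabor kernel might be generalized with curvature and size. The abstract does not give the formula, so the parameterization here (a parabolic bend of the rotated x-coordinate, an envelope stretched along the curve by `size`, and a DC correction) is an assumption, as are the function name `banana_kernel` and the parameters `half_width`, `sigma_x`, and `sigma_y`.

```python
import cmath
import math


def banana_kernel(freq, angle, curvature, size,
                  half_width=8, sigma_x=1.0, sigma_y=1.0):
    """Sample a curved Gabor ("banana") kernel on a square grid.

    Hypothetical parameterization for illustration only; the exact
    definition used by the authors is not stated in the abstract.
    Returns a list of rows of complex values.
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    coords = range(-half_width, half_width + 1)
    envelope, carrier = [], []
    for y in coords:
        for x in coords:
            # Rotate into the wavelet's own coordinate frame (orientation).
            xr = x * cos_a + y * sin_a
            yr = -x * sin_a + y * cos_a
            # Bend the frame: curvature shifts x by a parabola in y.
            xb = xr + curvature * yr * yr
            # Gaussian envelope; size stretches it along the curve.
            g = math.exp(-0.5 * freq ** 2 * ((xb / sigma_x) ** 2
                                             + (yr / (sigma_y * size)) ** 2))
            envelope.append(g)
            # Complex plane wave along the bent x-direction (frequency).
            carrier.append(cmath.exp(1j * freq * xb))
    # Subtract a DC term so that the sampled kernel sums to zero.
    dc = sum(g * c for g, c in zip(envelope, carrier)) / sum(envelope)
    flat = [g * (c - dc) for g, c in zip(envelope, carrier)]
    n = 2 * half_width + 1
    return [flat[r * n:(r + 1) * n] for r in range(n)]


kernel = banana_kernel(freq=0.5, angle=0.3, curvature=0.1, size=2.0)
```

In this sketch, setting `curvature=0` and `size=1` reduces the kernel to an ordinary DC-free Gabor kernel, which is one way to read the statement that banana wavelets generalize Gabor wavelets.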
