Learning Recognition and Segmentation Using the Cresceptron

This paper presents a framework called Cresceptron for view-based learning, recognition and segmentation. Specifically, it recognizes and segments image patterns that are similar to those learned, using a stochastic distortion model and view-based interpolation, which allows viewpoints that are moderately different from those used in learning. The learning phase is interactive. The user trains the system using a collection of training images. For each training image, the user manually draws a polygon outlining the region of interest and types in the label of its class. Then, from the directional edges of each of the segmented regions, the Cresceptron uses a hierarchical self-organization scheme to grow a sparsely connected network automatically, adaptively and incrementally during the learning phase. At each level, the system detects new image structures that need to be learned and assigns a new neural plane for each new feature. The network grows by creating new nodes and connections which memorize the new image structures and their context as they are detected. Thus, the structure of the network is a function of the training exemplars. The Cresceptron incorporates both individual learning and class learning; with the former, each training example is treated as a different individual, while with the latter, each example is a sample of a class. In the performance phase, segmentation and recognition are tightly coupled. No separate foreground extraction is necessary: segmentation is achieved by backtracking the response of the network down the hierarchy to the image parts contributing to recognition. Several stochastic shape distortion models are analyzed to show why multilevel matching such as that in the Cresceptron can deal with more general stochastic distortions that a single-level matching scheme cannot.
The system is demonstrated using images from broadcast television and other video segments to learn faces and other objects, and then later to locate and to recognize similar, but possibly distorted, views of the same objects.
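
The growth step described above (detect a structure that is not yet represented, then allocate a new feature for it) can be illustrated with a minimal sketch. This is not the Cresceptron's actual matching or growth rule, which the abstract does not specify. The class name `GrowingLayer`, the cosine-similarity match, and the `novelty_threshold` value are all assumptions made for illustration.

```python
import numpy as np


def _unit(patch):
    """Flatten a patch and scale it to unit length (all-zero patches stay zero)."""
    v = np.asarray(patch, dtype=float).ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


class GrowingLayer:
    """Hypothetical single layer that adds a feature only for novel patches."""

    def __init__(self, novelty_threshold=0.9):
        # Assumed parameter: a stored feature "explains" a patch when their
        # cosine similarity reaches this value.
        self.novelty_threshold = novelty_threshold
        self.templates = []  # one stored unit vector per learned feature

    def responses(self, patch):
        """Cosine similarity of the patch to every stored feature."""
        p = _unit(patch)
        return [float(t @ p) for t in self.templates]

    def learn(self, patch):
        """Return (feature_index, grew): reuse a matching feature or add one."""
        r = self.responses(patch)
        if r and max(r) >= self.novelty_threshold:
            return int(np.argmax(r)), False
        self.templates.append(_unit(patch))
        return len(self.templates) - 1, True


layer = GrowingLayer()
diag = [[1, 0], [0, 1]]
anti = [[0, 1], [1, 0]]
first = layer.learn(diag)   # nothing stored yet, so a feature is created
repeat = layer.learn(diag)  # same structure again, so the feature is reused
other = layer.learn(anti)   # orthogonal structure, so the layer grows
```

In this sketch the number of stored features depends only on which patches were presented and in what order, which mirrors the statement that the structure of the network is a function of the training exemplars.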
