Object Recognition: Theories

The goal of object recognition is to determine the identity or category of an object in a visual scene from the retinal input. In naturalistic scenes, object recognition is a computational challenge because the object may appear in various poses and contexts, that is, in arbitrary positions, orientations, and distances with respect to the viewer and to other objects. Object recognition involves matching representations of objects stored in memory to representations extracted from the visual image. The key issue in object recognition is the nature of the representation extracted from the image. Theories of object recognition are characterized in terms of four logically independent dimensions: the primitive features or parts extracted from the visual image, the stability of the set of features to transformations of the image, the type of relationships used to describe configurations of features, and the stability of configurations across transformations of the image. The two main classes of object recognition theories, structural description theories and view-based theories, are characterized with respect to these dimensions, and examples of theories of each class are presented. The strengths and weaknesses of each class are discussed, and the argument is made that the two classes are in fact complementary, not antagonistic.
