Human image understanding: Recent research and a theory

The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into simple volumetric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of components (N probably ≤ 36) can be derived from contrasts of five readily detectable properties of edges in a 2-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Prägnanz) characterize not the complete object but the object's components. A principle of componential recovery can account for the major phenomena of object recognition: If an arrangement of two or three primitive components can be recovered from the input, objects can be quickly recognized even when they are occluded, rotated in depth, novel, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory.
