Paper information - Representation and recognition of the spatial organization of three-dimensional shapes

Representation and recognition of the spatial organization of three-dimensional shapes

The human visual process can be studied by examining the computational problems associated with deriving useful information from retinal images. In this paper, we apply this approach to the problem of representing three-dimensional shapes for the purpose of recognition.

1. Three criteria, accessibility, scope and uniqueness, and stability and sensitivity, are presented for judging the usefulness of a representation for shape recognition.
2. Three aspects of a representation’s design are considered: (i) the representation’s coordinate system, (ii) its primitives, which are the primary units of shape information used in the representation, and (iii) the organization the representation imposes on the information in its descriptions.
3. In terms of these design issues and the criteria presented, a shape representation for recognition should: (i) use an object-centred coordinate system, (ii) include volumetric primitives of varied sizes, and (iii) have a modular organization. A representation based on a shape’s natural axes (for example the axes identified by a stick figure) follows directly from these choices.
4. The basic process for deriving a shape description in this representation must involve: (i) a means for identifying the natural axes of a shape in its image and (ii) a mechanism for transforming viewer-centred axis specifications to specifications in an object-centred coordinate system.
5. Shape recognition involves: (i) a collection of stored shape descriptions, and (ii) various indexes into the collection that allow a newly derived description to be associated with an appropriate stored description. The most important of these indexes allows shape recognition to proceed conservatively from the general to the specific based on the specificity of the information available from the image.
6. New constraints supplied by a conservative recognition process can be used to extract more information from the image. A relaxation process for carrying out this constraint analysis is described.
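
As a rough illustration of point 4(ii), the sketch below re-expresses a point given in viewer-centred coordinates relative to a frame built on one of the shape's axes. It is not the paper's method: the function names (`object_frame`, `to_object_centred`), the `up_hint` parameter, and the use of plain Cartesian coordinates are all assumptions made for this example.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / n for x in v)

def object_frame(axis_start, axis_end, up_hint=(0.0, 0.0, 1.0)):
    """Build an orthonormal frame (x, y, z) whose z runs along the given axis.

    up_hint is a hypothetical unit vector used only to fix the rotation
    about the axis; a fallback is chosen if it is almost parallel to the axis.
    """
    z = normalize(sub(axis_end, axis_start))
    if abs(dot(z, up_hint)) > 0.99:
        up_hint = (1.0, 0.0, 0.0)
    x = normalize(cross(up_hint, z))
    y = cross(z, x)  # already unit length because z and x are orthonormal
    return x, y, z

def to_object_centred(point, origin, frame):
    """Express a viewer-centred point in the frame anchored at origin."""
    d = sub(point, origin)
    return tuple(dot(d, basis) for basis in frame)

# Example: a "torso" axis seen by the viewer, and the tip of an "arm" axis.
torso_start, torso_end = (1.0, 2.0, 3.0), (1.0, 4.0, 3.0)
frame = object_frame(torso_start, torso_end)
arm_tip = to_object_centred((2.0, 4.0, 3.0), torso_start, frame)
```

In this sketch the arm tip is described by its offset along and across the torso axis rather than by its position relative to the viewer, which is the sense in which the description becomes object-centred.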

D. Marr | H. Nishihara

[1] D. Hubel, et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, 1962, The Journal of Physiology.

[2] Lawrence G. Roberts, et al. Machine Perception of Three-Dimensional Solids, 1963, Outstanding Dissertations in the Computer Sciences.

[3] James T. Tippett, et al. Optical and Electro-Optical Information Processing, 1965.

[4] F. Attneave. Triangles as ambiguous figures, 1968, The American Journal of Psychology.

[5] Patrick Henry Winston, et al. Learning structural descriptions from examples, 1970.

[6] H. Blum. Biological shape and visual science (part I), 1973.

[7] A. Taylor, et al. The contribution of the right parietal lobe to object recognition, 1973, Cortex; a journal devoted to the study of the nervous system and behavior.

[8] Marvin Minsky, et al. A framework for representing knowledge, 1974.

[9] Marvin Minsky, et al. "A framework for representing knowledge" in The Psychology of Computer Vision, 1975.

[10] D. Marr, et al. Early processing of visual information, 1976, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences.

[11] D. Marr, et al. Analysis of occluding contour, 1977, Proceedings of the Royal Society of London. Series B. Biological Sciences.