Recognition by prototypes

A scheme for recognizing 3D objects from single 2D images under orthographic projection is introduced. The scheme proceeds in two stages. In the first stage, the categorization stage, the image is compared to prototype objects. For each prototype, the view that most resembles the image is recovered, and, if the view is found to be similar to the image, the class identity of the object is determined. In the second stage, the identification stage, the observed object is compared to the individual models of its class, where classes are expected to contain objects with relatively similar shapes. For each model, a view that matches the image is sought. If such a view is found, the object's specific identity is determined. The advantage of categorizing the object before it is identified is twofold. First, the image is compared to a smaller number of models, since only models that belong to the object's class need to be considered. Second, the cost of comparing the image to each model in a class is very low, because correspondence is computed once for the whole class. More specifically, the correspondence and object pose computed in the categorization stage to align the prototype with the image are reused in the identification stage to align the individual models with the image. As a result, identification is reduced to a series of simple template comparisons. The paper concludes with an algorithm for constructing optimal prototypes for classes of objects.
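
The two-stage flow described above can be illustrated with a minimal sketch. It is not the paper's algorithm, and it rests on several simplifying assumptions that are not in the original: pose is restricted to a rotation about the vertical axis and found by brute-force search over a fixed set of angles, point correspondence is taken as given by index order, similarity is a sum of squared point distances, and the names `project`, `mismatch`, `best_view`, `recognize`, `CLASSES` and the toy coordinates are all hypothetical. What it is meant to show is the reuse of the pose recovered for the prototype when comparing the individual models of the chosen class.

```python
import math

def project(points, angle_deg):
    """Rotate 3D points about the y-axis, then drop z (orthographic projection)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(x * c + z * s, y) for x, y, z in points]

def mismatch(view, image):
    """Sum of squared distances between corresponding 2D points (index = correspondence)."""
    return sum((u - a) ** 2 + (v - b) ** 2 for (u, v), (a, b) in zip(view, image))

def best_view(prototype, image, angles=range(0, 360, 10)):
    """Assumed brute-force pose search: (mismatch, angle) of the closest prototype view."""
    return min((mismatch(project(prototype, a), image), a) for a in angles)

def recognize(image, classes, threshold=1.0):
    # Stage 1 (categorization): align every prototype, keep the most similar one.
    scored = [(best_view(c["prototype"], image), name) for name, c in classes.items()]
    (score, angle), cls = min(scored)
    if score > threshold:
        return None  # no prototype view is similar enough to the image
    # Stage 2 (identification): reuse the prototype's pose for each model in the
    # class, so each comparison is a plain template comparison with no new search.
    models = classes[cls]["models"]
    identity = min(models, key=lambda m: mismatch(project(models[m], angle), image))
    return cls, identity, angle

# Toy data (hypothetical): models in a class share x and z with the prototype
# and differ only in height.
CLASSES = {
    "chair": {
        "prototype": [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 1.0, 0.0), (0.0, 2.0, -1.0)],
        "models": {
            "chair_low": [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.8, 0.0), (0.0, 1.6, -1.0)],
            "chair_high": [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 1.2, 0.0), (0.0, 2.4, -1.0)],
        },
    },
    "table": {
        "prototype": [(2.0, 1.0, 0.0), (0.0, 1.0, 2.0), (-2.0, 1.0, 0.0), (0.0, 1.0, -2.0)],
        "models": {
            "table_a": [(2.0, 1.0, 0.0), (0.0, 1.0, 2.0), (-2.0, 1.0, 0.0), (0.0, 1.0, -2.0)],
        },
    },
}

# The observed image: one specific chair seen from a 30 degree rotation.
IMAGE = project(CLASSES["chair"]["models"]["chair_high"], 30)
print(recognize(IMAGE, CLASSES))  # ('chair', 'chair_high', 30)
```

In this sketch the pose search runs once per prototype, while each model in the selected class costs a single projection and comparison. A fuller version would presumably also apply a similarity threshold in the second stage, since the abstract states that a specific identity is assigned only if a matching view is found.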
