Toward True 3D Object Recognition

This paper addresses the problem of recognizing three-dimensional (3D) objects in photographs and image sequences. It revisits viewpoint invariants as a local representation of shape and appearance, and proposes a unified framework for object recognition in which an object model consists of a collection of small (planar) patches, their invariants, and a description of their 3D spatial relationships. This approach is applied to two fundamental instances of the 3D object recognition problem: (1) modeling rigid 3D objects from a small set of unregistered pictures and recognizing them in cluttered photographs taken from unconstrained viewpoints, and (2) recognizing non-uniform texture patterns despite appearance variations due to non-rigid transformations and changes in viewpoint. The approach is validated through several experiments, and extensions to the analysis of video sequences and the recognition of object categories are briefly discussed.
