3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints

This paper presents a representation for three-dimensional objects in terms of affine-invariant image patches and their spatial relationships. Multi-view constraints associated with groups of patches are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true three-dimensional affine and Euclidean models from multiple images and their recognition in a single photograph taken from an arbitrary viewpoint. The proposed approach does not require a separate segmentation stage and is applicable to cluttered scenes. Preliminary modeling and recognition results are presented.
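The "normalized representation" of a patch's appearance can be illustrated with a small sketch. It assumes, purely for illustration, that a patch is a parallelogram described by a center `c` and two half-edge vectors `h` and `v`; these names and the helper functions are hypothetical and not taken from the paper. Expressing an image point in the patch's own frame gives coordinates that do not change when the whole image undergoes an affine transformation, which is the property that makes such patches comparable across views.

```python
from typing import Tuple

Vec = Tuple[float, float]


def to_normalized(p: Vec, c: Vec, h: Vec, v: Vec) -> Vec:
    """Express image point p in the patch frame (hypothetical helper).

    Solves p = c + u*h + w*v for (u, w) with Cramer's rule, so the
    parallelogram patch maps onto the square [-1, 1] x [-1, 1].
    """
    det = h[0] * v[1] - h[1] * v[0]
    if abs(det) < 1e-12:
        raise ValueError("degenerate patch: h and v are parallel")
    dx, dy = p[0] - c[0], p[1] - c[1]
    u = (dx * v[1] - dy * v[0]) / det
    w = (h[0] * dy - h[1] * dx) / det
    return (u, w)


def to_image(q: Vec, c: Vec, h: Vec, v: Vec) -> Vec:
    """Map normalized coordinates q = (u, w) back into the image."""
    return (c[0] + q[0] * h[0] + q[1] * v[0],
            c[1] + q[0] * h[1] + q[1] * v[1])


def apply_affine(a: Tuple[Vec, Vec], t: Vec, x: Vec) -> Vec:
    """Apply the affine map x -> A x + t; pass t = (0, 0) for vectors."""
    return (a[0][0] * x[0] + a[0][1] * x[1] + t[0],
            a[1][0] * x[0] + a[1][1] * x[1] + t[1])
```

As a usage example, take a patch and a point, warp both with an arbitrary invertible affine map (transforming `c` and the point with translation, and `h` and `v` without), and compare `to_normalized` before and after: the two results should agree up to floating-point error.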
