Flexible Object Models for Category-Level 3D Object Recognition

Today's category-level object recognition systems largely focus on fronto-parallel views of objects with characteristic texture patterns. To overcome these limitations, we propose a novel framework for visual object recognition where object classes are represented by assemblies of partial surface models (PSMs) obeying loose local geometric constraints. The PSMs themselves are formed of dense, locally rigid assemblies of image features. Since our model only enforces local geometric consistency, both at the level of model parts and at the level of individual features within the parts, it is robust to viewpoint changes and intra-class variability. The proposed approach has been implemented, and it outperforms the state-of-the-art algorithms for object detection and localization recently compared in [14] on the Pascal 2005 VOC Challenge Cars Test 1 data.

[1]  J. Besag Spatial Interaction and the Statistical Analysis of Lattice Systems , 1974 .

[2]  D. G. Simpson,et al.  The Statistical Analysis of Discrete Data , 1989 .

[3]  J J Koenderink,et al.  Affine structure from motion. , 1991, Journal of the Optical Society of America. A, Optics and image science.

[4]  Michael I. Jordan,et al.  Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.

[5]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[6]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[7]  B. Schiele,et al.  Combined Object Categorization and Segmentation With an Implicit Shape Model , 2004 .

[8]  C. Schmid,et al.  Scale-invariant shape features for recognition of object categories , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[9]  Christophe Garcia,et al.  Convolutional face finder: a neural architecture for fast and robust face detection , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  T. Tuytelaars,et al.  Integrating multiple model views for object recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[11]  Cordelia Schmid,et al.  A maximum entropy framework for part-based texture and object recognition , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[12]  David A. Forsyth,et al.  Efficient Unsupervised Learning for Localization and Detection in Object Categories , 2005, NIPS.

[13]  Cordelia Schmid,et al.  3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints , 2006, International Journal of Computer Vision.

[14]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Cordelia Schmid,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[16]  Luc Van Gool,et al.  Towards Multi-View Object Class Detection , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[17]  Luc Van Gool,et al.  Edinburgh Research Explorer Simultaneous Object Recognition and Segmentation by Image Exploration , 2022 .

[18]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[19]  Jean Ponce,et al.  Modeling 3D Objects from Stereo Views and Recognizing Them in Photographs , 2006, ECCV.