论文信息 - Learning Category-Specific 3D Shape Models from Weakly Labeled 2D Images

Learning Category-Specific 3D Shape Models from Weakly Labeled 2D Images

Recently, researchers have made great processes to build category-specific 3D shape models from 2D images with manual annotations consisting of class labels, keypoints, and ground truth figure-ground segmentations. However, the annotation of figure-ground segmentations is still labor-intensive and time-consuming. To further alleviate the burden of providing such manual annotations, we make the earliest effort to learn category-specific 3D shape models by only using weakly labeled 2D images. By revealing the underlying relationship between the tasks of common object segmentation and category-specific 3D shape reconstruction, we propose a novel framework to jointly solve these two problems along a cluster-level learning curriculum. Comprehensive experiments on the challenging PASCAL VOC benchmark demonstrate that the category-specific 3D shape models trained using our weakly supervised learning framework could, to some extent, approach the performance of the state-of-the-art methods using expensive manual segmentation annotations. In addition, the experiments also demonstrate the effectiveness of using 3D shape models for helping common object segmentation.

[1] ZissermanAndrew,et al. The Pascal Visual Object Classes Challenge , 2015 .

[2] Lawrence G. Roberts,et al. Machine Perception of Three-Dimensional Solids , 1963, Outstanding Dissertations in the Computer Sciences.

[3] Henning Biermann,et al. Recovering non-rigid 3D shape from image streams , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[4] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006 .

[5] Jitendra Malik,et al. Category-specific object reconstruction from a single image , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Feiping Nie,et al. Object Co-segmentation via Graph Optimized-Flexible Manifold Ranking , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Ling Shao,et al. Cosaliency Detection Based on Intrasaliency Prior Transfer and Deep Intersaliency Mining , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[8] S. Stigler. Francis Galton's Account of the Invention of Correlation , 1989 .

[9] Bernt Schiele,et al. Detailed 3D Representations for Object Recognition and Modeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] Francis Schmitt,et al. Silhouette and stereo fusion for 3D object modeling , 2003, Fourth International Conference on 3-D Digital Imaging and Modeling, 2003. 3DIM 2003. Proceedings..

[11] Xinlei Chen,et al. Webly Supervised Learning of Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[12] Jitendra Malik,et al. Discriminative Decorrelation for Clustering and Classification , 2012, ECCV.

[13] Edward H. Adelson,et al. Playing with Puffball: simple scale-invariant inflation for use in vision and graphics , 2012, SAP '12.

[14] Max Jaderberg,et al. Unsupervised Learning of 3D Structure from Images , 2016, NIPS.

[15] Alexei A. Efros,et al. Learning Dense Correspondence via 3D-Guided Cycle Consistency , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Vladlen Koltun,et al. Single-view reconstruction via joint analysis of image and shape collections , 2015, ACM Trans. Graph..

[17] Lei Guo,et al. Object Detection in Optical Remote Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[18] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19] David W. Jacobs,et al. WarpNet: Weakly Supervised Matching for Single-View Reconstruction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Touradj Ebrahimi,et al. MESH: measuring errors between surfaces using the Hausdorff distance , 2002, Proceedings. IEEE International Conference on Multimedia and Expo.

[21] Jean Ponce,et al. Discriminative clustering for image co-segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22] Daniel Cremers,et al. Fast Joint Estimation of Silhouettes and Dense 3D Geometry from Multiple Images , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Jitendra Malik,et al. Color Constancy, Intrinsic Images, and Shape Estimation , 2012, ECCV.

[24] Sebastian Thrun,et al. SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[25] Alexei A. Efros,et al. Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[26] Yunchao Wei,et al. STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] Vladimir Kolmogorov,et al. Graph cut based image segmentation with connectivity priors , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[28] Xuelong Li,et al. Detection of Co-salient Objects by Looking Deep and Wide , 2016, International Journal of Computer Vision.

[29] Deyu Meng,et al. Bridging Saliency Detection to Weakly Supervised Object Detection Based on Self-Paced Curriculum Learning , 2016, IJCAI.

[30] Jitendra Malik,et al. Learning Category-Specific Deformable 3D Models for Object Reconstruction , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31] Leonidas J. Guibas,et al. Estimating image depth using shape collections , 2014, ACM Trans. Graph..

[32] Silvio Savarese,et al. 3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[33] Peter V. Gehler,et al. 3D object class detection in the wild , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[34] Jason Weston,et al. Curriculum learning , 2009, ICML '09.

[35] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[36] Xinlei Chen,et al. Enriching Visual Knowledge Bases via Object Discovery and Segmentation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[37] Xueming Qian,et al. Semantic Annotation of High-Resolution Satellite Images via Weakly Supervised Learning , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[38] Silvio Savarese,et al. Beyond PASCAL: A benchmark for 3D object detection in the wild , 2014, IEEE Winter Conference on Applications of Computer Vision.

[39] Lourdes Agapito,et al. Lifting Object Detection Datasets into 3D , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40] David G. Lowe,et al. Three-Dimensional Object Recognition from Single Two-Dimensional Images , 1987, Artif. Intell..

[41] Lourdes Agapito,et al. Reconstructing PASCAL VOC , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.