Paper information - Joint Embedding of 3D Scan and CAD Objects

Joint Embedding of 3D Scan and CAD Objects

3D scan geometry and CAD models often contain complementary information for understanding environments, which can be leveraged by establishing a mapping between the two domains. However, this is a challenging task due to strong, low-level differences between scan and CAD geometry. We propose a novel 3D CNN-based approach to learn a joint embedding space between scan and CAD geometry, where semantically similar objects from both domains lie close together. To learn a shared space where scan objects and CAD models can interlace, we propose a stacked hourglass approach that separates foreground from background in a scan object and transforms it to a complete, CAD-like representation. This embedding space can then be used for CAD model retrieval; to further enable this task, we introduce a new dataset of ranked scan-CAD similarity annotations, enabling new, fine-grained evaluation of CAD model retrieval for cluttered, noisy, partial scans. Our learned joint embedding outperforms the current state of the art for CAD model retrieval by 12% in instance retrieval accuracy.
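The retrieval step described in the abstract can be illustrated with a minimal sketch: once scan objects and CAD models share one embedding space, retrieval reduces to ranking CAD embeddings by closeness to the scan embedding. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the three-dimensional toy vectors, the model identifiers, and the choice of cosine similarity as the closeness measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_cad_models(scan_embedding, cad_embeddings, top_k=3):
    """Return the top_k (cad_id, similarity) pairs, most similar first.

    cad_embeddings maps a hypothetical CAD model id to its embedding.
    """
    scored = [
        (cad_id, cosine_similarity(scan_embedding, embedding))
        for cad_id, embedding in cad_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical values; a trained network would produce these).
cad_embeddings = {
    "chair_a": [0.9, 0.1, 0.0],
    "chair_b": [0.8, 0.3, 0.1],
    "table_a": [0.1, 0.9, 0.2],
}
scan_embedding = [1.0, 0.2, 0.0]

ranking = retrieve_cad_models(scan_embedding, cad_embeddings, top_k=2)
for cad_id, similarity in ranking:
    print(f"{cad_id}: {similarity:.3f}")
```

With these toy vectors the two chair models rank above the table, which is the behavior a joint embedding is meant to provide: semantically similar objects from both domains end up close together.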
