End-to-End Visual Domain Adaptation Network for Cross-Domain 3D CPS Data Retrieval

3D CPS (Cyber-Physical System) data is widely generated and used in applications such as autonomous driving and unmanned aerial vehicles. In large-scale 3D CPS data analysis, 3D object retrieval plays a significant role in urban perception. In this paper, we propose an end-to-end domain adaptation framework for cross-domain 3D object retrieval (C3DOR-Net), which learns a joint embedding space for 3D objects from different domains. We focus on the unsupervised case, in which 3D objects in the target domain are unlabeled. To better encode a 3D object, the proposed method learns multi-view visual features in a data-driven manner for 3D object representation. A domain adaptation strategy is then applied to benefit both domain alignment and final classification. In particular, a center-based discriminative feature learning method gives the domain-invariant features better intra-class compactness and inter-class separability. By maximizing the inter-class divergence and minimizing the intra-class divergence, C3DOR-Net achieves remarkable retrieval performance. We evaluate our method on two cross-domain protocols: 1) CAD-to-CAD object retrieval on two popular 3D datasets (NTU and PSB) in three designed cross-domain scenarios; 2) SHREC’19 monocular image based 3D object retrieval. Experimental results demonstrate that our method significantly boosts cross-domain retrieval performance.
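
The center-based discriminative objective mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulation, so the hinge form, the margins `m_intra` and `m_inter`, and the function name `center_discriminative_loss` below are assumptions made for illustration only, not the C3DOR-Net implementation. The sketch pulls each feature toward its class center (intra-class compactness) and pushes pairs of class centers apart (inter-class separability).

```python
import numpy as np


def center_discriminative_loss(features, labels, centers,
                               m_intra=0.0, m_inter=100.0):
    """Illustrative center-based loss (hypothetical form, not the paper's).

    features: (N, D) array of embedded objects
    labels:   (N,) integer class indices in [0, K)
    centers:  (K, D) array of class centers
    Returns (intra, inter): two non-negative scalars to be minimized.
    """
    # Intra-class term: squared distance from each sample to its own
    # class center, penalized only beyond the margin m_intra.
    diffs = features - centers[labels]
    sq_dist = np.sum(diffs ** 2, axis=1)
    intra = float(np.mean(np.maximum(0.0, sq_dist - m_intra)))

    # Inter-class term: penalize every pair of distinct centers whose
    # squared distance is smaller than the margin m_inter.
    k = centers.shape[0]
    if k < 2:
        return intra, 0.0
    pair_sq = np.sum((centers[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    upper = np.triu_indices(k, k=1)  # each unordered pair once
    inter = float(np.mean(np.maximum(0.0, m_inter - pair_sq[upper])))
    return intra, inter


# Toy usage: two classes in 2D, with centers set to the class means.
features = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [10.0, 2.0]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[1.0, 0.0], [10.0, 1.0]])
intra, inter = center_discriminative_loss(features, labels, centers)
```

In a training loop, a weighted sum of the two terms would typically be added to the classification and domain alignment losses; the weights and margins are hyperparameters that this sketch leaves open.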
