Joint Heterogeneous Feature Learning and Distribution Alignment for 2D Image-Based 3D Object Retrieval

2D image-based 3D object retrieval is a novel but challenging task in 3D object retrieval. In this paper, we propose a 2D image-based 3D object retrieval method based on joint heterogeneous feature learning and distribution alignment. Specifically, we learn a mapping function on the Grassmann manifold to reduce the divergence between the heterogeneous features of 2D images and 3D objects. We further employ a data distribution alignment method that adaptively integrates the marginal and conditional distributions. We embed both terms into the objective function to learn a domain-invariant classifier based on structural risk minimization. The resulting domain-invariant features of 2D images and 3D objects can then be used for 2D image-based 3D object retrieval. Since no large-scale dataset exists for evaluating this task, we build two new datasets, MI3DOR and MI3DOR-2. We compare the proposed method against representative domain adaptation methods and explore the influence of the different components of the objective function and of the key parameters. Comparative experiments show that the proposed method outperforms these baselines.
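
The adaptive integration of marginal and conditional distributions can be illustrated with a minimal sketch. The abstract does not give the exact objective, so everything below is an assumption: the function names `linear_mmd` and `balanced_discrepancy`, the balance factor `mu`, the use of a linear-kernel mean discrepancy, and the weighting form (1 - mu) * marginal + mu * conditional, which is borrowed from balanced distribution adaptation in general rather than taken from this paper. Here `xs` stands for labeled 2D image features and `xt` for 3D object features with pseudo-labels.

```python
import numpy as np


def linear_mmd(xs, xt):
    """Squared distance between the feature means of two sample sets."""
    diff = xs.mean(axis=0) - xt.mean(axis=0)
    return float(diff @ diff)


def balanced_discrepancy(xs, ys, xt, yt_pseudo, mu):
    """Hypothetical weighting: (1 - mu) * marginal + mu * conditional.

    mu = 0 uses only the marginal term; mu = 1 uses only the
    class-conditional term (summed over classes present in both domains).
    """
    marginal = linear_mmd(xs, xt)
    conditional = 0.0
    for c in np.unique(ys):
        xs_c = xs[ys == c]
        xt_c = xt[yt_pseudo == c]
        # Skip classes that have no pseudo-labeled target samples.
        if len(xs_c) and len(xt_c):
            conditional += linear_mmd(xs_c, xt_c)
    return (1.0 - mu) * marginal + mu * conditional


if __name__ == "__main__":
    xs = np.array([[0.0, 0.0], [2.0, 0.0]])
    ys = np.array([0, 1])
    xt = np.array([[1.0, 0.0], [3.0, 0.0]])
    yt_pseudo = np.array([0, 1])
    print(balanced_discrepancy(xs, ys, xt, yt_pseudo, mu=0.5))
```

In a full method, a term of this kind would be minimized jointly with the classification loss, and `mu` would be estimated from the data rather than fixed by hand.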
