Self-Supervised Auxiliary Domain Alignment for Unsupervised 2D Image-Based 3D Shape Retrieval

Unsupervised 2D image-based 3D shape retrieval aims to retrieve similar unlabeled 3D shapes given a labeled 2D image as the query. Although many methods have made progress, performance on this task remains limited because the absence of target labels leaves a large domain gap. In this paper, we aim to learn discriminative representations of the unlabeled target 3D shapes and to facilitate domain adaptation by taking full advantage of multi-view information. To this end, we propose self-supervised auxiliary domain alignment (SADA) for unsupervised 2D image-based 3D shape retrieval. SADA consists of multi-view guided self-supervised feature learning and two auxiliary domain alignments: intermediate domain alignment and multi-domain alignment. First, we group the multiple views of each 3D shape into two sub-target domains based on view similarity and use each as a constraint on the other to optimize feature learning in an unsupervised manner. Second, to reduce the difficulty of directly aligning the source and target domains, we combine labeled source samples with target samples that share the same category (according to their pseudo labels) to generate an intermediate domain, which decomposes source-target alignment into source-intermediate and intermediate-target alignments. Moreover, to explore the inner characteristics of the target 3D shapes and provide more clues for adaptation, we propose multi-domain alignment, which replaces alignment between the source and a single target domain with alignment between the source and multiple target domains (one target domain and two sub-target domains). Adversarial training and semantic alignment are employed to fully exploit the relations between the source domain and the multiple target domains. Experiments on two challenging datasets show that the proposed method achieves competitive performance on the unsupervised 2D image-based 3D shape retrieval task.
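
The intermediate-domain step can be illustrated with a small sketch. The abstract does not say how source and target samples are combined, so the code below assumes a mixup-style convex combination of feature vectors; the function name `build_intermediate_domain`, the mixing weight `lam`, and the random choice of a same-class partner are all hypothetical and should not be read as the authors' implementation.

```python
import numpy as np


def build_intermediate_domain(src_feats, src_labels, tgt_feats, tgt_pseudo,
                              lam=0.5, seed=0):
    """Mix each source sample with one target sample of the same class.

    Assumption: the "combination" is a convex mix lam * source + (1 - lam) * target.
    Target classes come from pseudo labels, so they may be noisy.
    """
    rng = np.random.default_rng(seed)
    mixed, mixed_labels = [], []
    for feat, label in zip(src_feats, src_labels):
        # Indices of target samples whose pseudo label matches this source label.
        candidates = np.flatnonzero(tgt_pseudo == label)
        if candidates.size == 0:
            continue  # no same-class target sample; skip this source sample
        j = rng.choice(candidates)
        mixed.append(lam * feat + (1.0 - lam) * tgt_feats[j])
        mixed_labels.append(label)
    if not mixed:
        # Nothing could be paired: return empty arrays of matching shape.
        return (np.empty((0, src_feats.shape[1])),
                np.empty((0,), dtype=src_labels.dtype))
    return np.stack(mixed), np.array(mixed_labels)


# Toy usage: three source features, two pseudo-labeled target features.
src_feats = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
src_labels = np.array([0, 1, 2])
tgt_feats = np.array([[2.0, 0.0], [0.0, 2.0]])
tgt_pseudo = np.array([0, 1])
inter_feats, inter_labels = build_intermediate_domain(
    src_feats, src_labels, tgt_feats, tgt_pseudo, lam=0.5)
```

In this sketch the intermediate samples sit between the two domains in feature space, so aligning source to intermediate and intermediate to target each covers a smaller gap than aligning source to target directly. The source sample of class 2 is dropped because no target sample carries that pseudo label.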
