Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval

Zero-Shot Sketch-Based Image Retrieval (ZSSBIR) is a novel cross-modal retrieval task, where abstract sketches are used as queries to retrieve natural images under zero-shot scenario. Most existing methods regard ZS-SBIR as a traditional classification problem and employ a cross-entropy or triplet-based loss to achieve retrieval, which neglect the problems of the domain gap between sketches and natural images and the large intra-class diversity in sketches. Toward this end, we propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR. Specifically, a cross-modal contrastive method is proposed to learn generalized representations to smooth the domain gap by mining relations with additional augmented samples. Furthermore, a category-specific memory bank with sketch features is explored to reduce intra-class diversity in the sketch domain. Extensive experiments demonstrate that our approach notably outperforms the state-of-the-art methods in both Sketchy and TUBerlin datasets.

[1]  Qing Liu,et al.  Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2]  Ling Shao,et al.  Zero-Shot Sketch-Image Hashing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Ce Liu,et al.  Supervised Contrastive Learning , 2020, NeurIPS.

[4]  Cordelia Schmid,et al.  Label-Embedding for Image Classification , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Tao Xiang,et al.  Sketch Less for More: On-the-Fly Fine-Grained Sketch-Based Image Retrieval , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Kun Wei,et al.  Lifelong Zero-Shot Learning , 2020, IJCAI.

[7]  Christoph H. Lampert,et al.  Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Hao Wang,et al.  Progressive Domain-Independent Feature Decomposition Network for Zero-Shot Sketch-Based Image Retrieval , 2020, IJCAI.

[9]  Marc Alexa,et al.  How do humans sketch objects? , 2012, ACM Trans. Graph..

[10]  Xianglong Liu,et al.  Adversarial Fine-Grained Composition Learning for Unseen Attribute-Object Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[11]  Zhe Xu,et al.  Generalized Zero-Shot Learning via Disentangled Representation , 2021, AAAI.

[12]  Shaogang Gong,et al.  Semantic Autoencoder for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Honggang Zhang,et al.  Sketch-based image retrieval via Siamese convolutional neural network , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[14]  Guodong Guo,et al.  Learning Large Euclidean Margin for Sketch-based Image Retrieval , 2018, ArXiv.

[15]  Yang Yang,et al.  Zero-Shot Hashing via Transferring Supervised Knowledge , 2016, ACM Multimedia.

[16]  Zeynep Akata,et al.  Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Gustavo Carneiro,et al.  Multi-modal Cycle-consistent Generalized Zero-Shot Learning , 2018, ECCV.

[18]  Bernt Schiele,et al.  Feature Generating Networks for Zero-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[20]  Tao Xiang,et al.  Sketch-a-Net: A Deep Neural Network that Beats Humans , 2017, International Journal of Computer Vision.

[21]  Ling Shao,et al.  Deep Sketch Hashing: Fast Free-Hand Sketch-Based Image Retrieval , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[23]  Ling Shao,et al.  Generative Domain-Migration Hashing for Sketch-to-Image Retrieval , 2018, ECCV.

[24]  Xu Yang,et al.  Incremental Embedding Learning via Zero-Shot Translation , 2020, AAAI.