Stacked Adversarial Network for Zero-Shot Sketch based Image Retrieval

Conventional approaches to Sketch-Based Image Retrieval (SBIR) assume that the data of all the classes are available during training. The assumption may not always be practical since the data of a few classes may be unavailable, or the classes may not appear at the time of training. Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) relaxes this constraint and allows the algorithm to handle previously unseen classes during the test. This paper proposes a generative approach based on the Stacked Adversarial Network (SAN) and the advantage of Siamese Network (SN) for ZS-SBIR. While SAN generates a high-quality sample, SN learns a better distance metric compared to that of the nearest neighbor search. The capability of the generative model to synthesize image features based on the sketch reduces the SBIR problem to that of an image-to-image retrieval problem. We evaluate the efficacy of our proposed approach on TU-Berlin, and Sketchy database in both standard ZSL and generalized ZSL setting. The proposed method yields a significant improvement in standard ZSL as well as in a more challenging generalized ZSL setting (GZSL) for SBIR.

[1]  Hema A. Murthy,et al.  A Generative Model for Zero Shot Learning Using Conditional Variational Autoencoders , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Xiaochun Cao,et al.  SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor , 2013, 2013 IEEE International Conference on Computer Vision.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Anurag Mittal,et al.  A Zero-Shot Framework for Sketch-based Image Retrieval , 2018, ECCV.

[5]  Fang Wang,et al.  Sketch-based 3D shape retrieval using Convolutional Neural Networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Honggang Zhang,et al.  Sketch-based image retrieval via Siamese convolutional neural network , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[7]  Anurag Mittal,et al.  Adversarial Joint-Distribution Learning for Novel Class Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[8]  Yue Gao,et al.  Synthesizing Samples for Zero-shot Learning , 2017, IJCAI.

[9]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[10]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[11]  Xiaochun Cao,et al.  SketchNet: Sketch Classification with Web Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Josep Lladós,et al.  Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Piyush Rai,et al.  A Generative Approach to Zero-Shot and Few-Shot Action Recognition , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[14]  Kai Fan,et al.  Zero-Shot Learning via Class-Conditioned Deep Generative Models , 2017, AAAI.

[15]  Philip H. S. Torr,et al.  An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[16]  Honglak Lee,et al.  Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[17]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[19]  Piyush Rai,et al.  Generalized Zero-Shot Learning via Synthesized Examples , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Marc Alexa,et al.  How do humans sketch objects? , 2012, ACM Trans. Graph..

[22]  Zeynep Akata,et al.  Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  James Hays,et al.  The sketchy database , 2016, ACM Trans. Graph..

[25]  Ling Shao,et al.  Zero-Shot Sketch-Image Hashing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  Bernt Schiele,et al.  Evaluation of output embeddings for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Ling Shao,et al.  Deep Sketch Hashing: Fast Free-Hand Sketch-Based Image Retrieval , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Andrew Y. Ng,et al.  Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[29]  Feng Liu,et al.  Sketch Me That Shoe , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Shaogang Gong,et al.  Semantic Autoencoder for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Piyush Rai,et al.  A Simple Exponential Family Framework for Zero-Shot Learning , 2017, ECML/PKDD.

[32]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[33]  Navdeep Jaitly,et al.  Adversarial Autoencoders , 2015, ArXiv.

[34]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[35]  Bernhard Schölkopf,et al.  A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..

[36]  Xun Xu,et al.  Transductive Zero-Shot Action Recognition by Word-Vector Embedding , 2015, International Journal of Computer Vision.

[37]  Bernt Schiele,et al.  Feature Generating Networks for Zero-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Georgiana Dinu,et al.  Improving zero-shot learning by mitigating the hubness problem , 2014, ICLR.

[39]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Liqing Zhang,et al.  Edgel index for large-scale sketch-based image search , 2011, CVPR 2011.

[41]  Tao Xiang,et al.  Sketch-a-Net: A Deep Neural Network that Beats Humans , 2017, International Journal of Computer Vision.

[42]  Anurag Mittal,et al.  Similarity-Invariant Sketch-Based Image Retrieval in Large Databases , 2014, ECCV.

[43]  Piyush Rai,et al.  Generative Model for Zero-Shot Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[44]  Piyush Rai,et al.  A Meta-Learning Framework for Generalized Zero-Shot Learning , 2019, ArXiv.