Deep Sketch Hashing: Fast Free-Hand Sketch-Based Image Retrieval

Free-hand sketch-based image retrieval (SBIR) is a specific cross-view retrieval task in which queries are abstract and ambiguous sketches while the retrieval database consists of natural images. Work in this area mainly focuses on extracting representative and shared features for sketches and natural images. However, these approaches neither cope well with the geometric distortion between sketches and images nor scale to large-scale SBIR, because of the costly continuous-valued distance computation. In this paper, we speed up SBIR by introducing a novel binary coding method, named Deep Sketch Hashing (DSH), in which a semi-heterogeneous deep architecture is proposed and incorporated into an end-to-end binary coding framework. Specifically, three convolutional neural networks are used to encode free-hand sketches, natural images and, especially, the auxiliary sketch-tokens, which are adopted as bridges to mitigate the sketch-image geometric distortion. The learned DSH codes can effectively capture the cross-view similarities as well as the intrinsic semantic correlations between different categories. To the best of our knowledge, DSH is the first hashing work specifically designed for category-level SBIR with an end-to-end deep architecture. The proposed DSH is comprehensively evaluated on two large-scale datasets, TU-Berlin Extension and Sketchy, and the experiments consistently show DSH's superior SBIR accuracy over several state-of-the-art methods, while achieving significantly reduced retrieval time and memory footprint.
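
To illustrate why binary codes reduce retrieval time and memory compared with continuous-valued distances, the following is a minimal sketch of Hamming-distance ranking over packed binary codes. It is not the DSH training procedure or the authors' implementation: the function names `pack_codes` and `hamming_rank` are hypothetical, and the sign-thresholding step is an assumption standing in for whatever binarization a learned hashing network would apply.

```python
import numpy as np

def pack_codes(real_valued):
    """Binarize embeddings by sign (assumed rule) and pack 8 bits per byte."""
    x = np.atleast_2d(np.asarray(real_valued))
    bits = (x > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)

def hamming_rank(query_code, db_codes):
    """Rank database codes by Hamming distance to one packed query code."""
    # XOR marks differing bits; counting them gives the Hamming distance.
    xor = np.bitwise_xor(db_codes, query_code)
    dists = np.unpackbits(xor, axis=1).sum(axis=1)
    order = np.argsort(dists, kind="stable")
    return order, dists[order]

# Toy 8-bit example: one sketch query against four image codes.
query = pack_codes([1, 1, 1, 1, -1, -1, -1, -1])
database = pack_codes([
    [1, 1, 1, 1, -1, -1, -1, -1],   # identical to the query
    [-1, -1, -1, -1, 1, 1, 1, 1],   # every bit flipped
    [1, 1, 1, 1, -1, -1, -1, 1],    # one bit differs
    [-1, -1, 1, 1, 1, 1, -1, -1],   # four bits differ
])
order, dists = hamming_rank(query, database)
```

In this toy setup each 8-bit code occupies a single byte, whereas an 8-dimensional float32 descriptor would occupy 32 bytes, and the distance reduces to an XOR followed by a bit count.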
