Compositional Sketch Search

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects. Sketch based image retrieval (SBIR) methods predominantly match queries containing a single, dominant object invariant to its position within an image. Our work exploits drawings as a concise and intuitive representation for specifying entire scene compositions. We train a convolutional neural network (CNN) to encode masked visual features from sketched objects, pooling these into a spatial descriptor encoding the spatial relationships and appearances of objects in the composition. Training the CNN backbone as a Siamese network under triplet loss yields a metric search embedding for measuring compositional similarity which may be efficiently leveraged for visual search by applying product quantization.

[1]  Jian Sun,et al.  Optimized Product Quantization for Approximate Nearest Neighbor Search , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[3]  Jeff Johnson,et al.  Billion-Scale Similarity Search with GPUs , 2017, IEEE Transactions on Big Data.

[4]  Jose M. Saavedra,et al.  Sketch based Image Retrieval using Learned KeyShapes (LKS) , 2015, BMVC.

[5]  Liqing Zhang,et al.  MindFinder: interactive sketch-based image search on millions of images , 2010, ACM Multimedia.

[6]  John P. Collomosse,et al.  Compact descriptors for sketch-based image retrieval using a triplet loss convolutional neural network , 2017, Comput. Vis. Image Underst..

[7]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Eugenio Di Sciascio,et al.  Content-Based Image Retrieval over the Web Using Query by Sketch and Relevance Feedback , 1999, VISUAL.

[9]  Jordi Pont-Tuset,et al.  The Open Images Dataset V4 , 2018, International Journal of Computer Vision.

[10]  Yu Qian,et al.  Storyboard sketches for Content Based Video Retrieval , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[11]  Rui Hu,et al.  A performance evaluation of gradient field HOG descriptor for sketch based image retrieval , 2013, Comput. Vis. Image Underst..

[12]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Marc Alexa,et al.  Sketch-Based Image Retrieval: Benchmark and Bag-of-Features Descriptors , 2011, IEEE Transactions on Visualization and Computer Graphics.

[14]  Honggang Zhang,et al.  Sketch-based image retrieval via Siamese convolutional neural network , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[15]  Rui Hu,et al.  Markov random fields for sketch based video retrieval , 2013, ICMR '13.

[16]  Marc Alexa,et al.  How do humans sketch objects? , 2012, ACM Trans. Graph..

[17]  Quoc V. Le,et al.  EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[18]  Albert Gordo,et al.  Deep Image Retrieval: Learning Global Representations for Image Search , 2016, ECCV.

[19]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[20]  Hailin Jin,et al.  Sketching with Style: Visual Search with Sketches and Aesthetic Context , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Yongyi Gong,et al.  Sketch-Based Shape Retrieval via Multi-view Attention and Generalized Similarity , 2018, 2018 7th International Conference on Digital Home (ICDH).

[22]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Hailin Jin,et al.  Spatial-Semantic Image Search by Visual Feature Synthesis , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Fang Liu,et al.  SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches , 2020, ECCV.

[25]  Leo Sampaio Ferraz Ribeiro,et al.  Sketching out the details: Sketch-based image retrieval using convolutional neural networks with multi-stage regression , 2018, Comput. Graph..

[26]  John Collomosse,et al.  Sketchformer: Transformer-Based Representation for Sketched Structure , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Fang Wang,et al.  Sketch-based 3D shape retrieval using Convolutional Neural Networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Feng Liu,et al.  Sketch Me That Shoe , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).