The Sketchy Database

We present the Sketchy database, the first large-scale collection of sketch-photo pairs. We ask crowd workers to sketch particular photographic objects sampled from 125 categories and acquire 75,471 sketches of 12,500 objects. The Sketchy database gives us fine-grained associations between particular photos and sketches, and we use these associations to train cross-domain convolutional networks that embed sketches and photographs in a common feature space. We use our database as a benchmark for fine-grained retrieval and show that our learned representation significantly outperforms both hand-crafted features and deep features trained for sketch or photo classification. Beyond image retrieval, we believe the Sketchy database opens up new opportunities for sketch and image understanding and synthesis.
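To make the retrieval setting concrete, the following is a minimal sketch of how fine-grained retrieval in a common feature space could be scored once sketches and photos have been embedded. It is not the networks or the evaluation protocol described above: the function names `rank_photos` and `recall_at_k`, the 64-dimensional embeddings, the Euclidean distance, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def rank_photos(sketch_emb, photo_embs):
    """Return photo indices sorted from nearest to farthest from one sketch.

    Assumes sketch and photo embeddings live in the same feature space and
    that Euclidean distance is the similarity measure (an assumption here).
    """
    dists = np.linalg.norm(photo_embs - sketch_emb, axis=1)
    return np.argsort(dists)


def recall_at_k(sketch_embs, photo_embs, true_photo_idx, k=1):
    """Fraction of sketches whose paired photo appears in the top-k results."""
    hits = 0
    for emb, target in zip(sketch_embs, true_photo_idx):
        if target in rank_photos(emb, photo_embs)[:k]:
            hits += 1
    return hits / len(sketch_embs)


# Toy stand-in data: random "photo" embeddings, and "sketch" embeddings built
# as noisy copies of the photo each sketch is paired with. Real embeddings
# would come from trained sketch and photo networks.
rng = np.random.default_rng(0)
photo_embs = rng.normal(size=(100, 64))
true_photo_idx = rng.integers(0, 100, size=20)
sketch_embs = photo_embs[true_photo_idx] + 0.05 * rng.normal(size=(20, 64))

print("recall@1:", recall_at_k(sketch_embs, photo_embs, true_photo_idx, k=1))
```

Because each sketch is paired with one particular photo rather than only a category, a metric of this kind counts a retrieval as correct only when that specific photo is returned, which is what separates fine-grained retrieval from category-level retrieval.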
