Sketchformer: Transformer-Based Representation for Sketched Structure

Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in vector form, i.e. as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch-based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary-learning tokenization scheme, yields state-of-the-art performance in classification and image retrieval tasks when compared against baseline representations driven by LSTM sequence-to-sequence architectures: SketchRNN and derivatives. We show that sketch reconstruction and interpolation are improved significantly by the Sketchformer embedding for complex sketches with longer stroke sequences.
