Enabling My Robot To Play Pictionary: Recurrent Neural Networks For Sketch Recognition

Freehand sketching is an inherently sequential process. Yet most approaches to hand-drawn sketch recognition either ignore this sequential aspect or exploit it in an ad hoc manner. In our work, we propose a recurrent neural network architecture for sketch object recognition that exploits the long-term sequential and structural regularities in stroke data in a scalable manner. Specifically, we introduce a Gated Recurrent Unit (GRU) based framework that leverages deep sketch features and a weighted per-timestep loss to achieve state-of-the-art results on a large database of freehand object sketches spanning a large number of object categories. The inherently online nature of our framework makes it especially well suited to on-the-fly recognition of objects as they are being drawn. Thus, our framework can enable interesting applications such as camera-equipped robots playing the popular party game Pictionary with human players and generating sparsified yet recognizable sketches of objects.
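
To make the idea of a weighted per-timestep loss concrete, the following is a minimal NumPy sketch, not the implementation behind the reported results. It runs one common formulation of a GRU over a sequence of per-stroke feature vectors, predicts a class distribution at every timestep, and combines the per-timestep cross-entropies using weights. Everything specific here is an assumption for illustration: the function names, the layer sizes, the random vectors standing in for deep sketch features, and the linearly increasing weight schedule.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())  # subtract the max for numerical stability
    return e / e.sum()

def init_params(feat_dim, hidden_dim, n_classes, seed=0):
    """Randomly initialised GRU and classifier weights (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = rng.normal(0.0, 0.1, size=(hidden_dim, feat_dim))
        p["U" + gate] = rng.normal(0.0, 0.1, size=(hidden_dim, hidden_dim))
        p["b" + gate] = np.zeros(hidden_dim)
    p["Wo"] = rng.normal(0.0, 0.1, size=(n_classes, hidden_dim))
    p["bo"] = np.zeros(n_classes)
    return p

def gru_step(x, h, p):
    """One GRU update: update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_tilde

def per_timestep_probs(features, p):
    """Class probabilities after each stroke, so a guess is available online."""
    h = np.zeros(p["bz"].shape[0])
    probs = []
    for x in features:
        h = gru_step(x, h, p)
        probs.append(softmax(p["Wo"] @ h + p["bo"]))
    return np.stack(probs)

def weighted_sequence_loss(probs, label, weights):
    """Weighted average of the cross-entropy at every timestep."""
    ce = -np.log(probs[:, label] + 1e-12)
    return float(np.sum(weights * ce) / np.sum(weights))

# Toy usage: 5 strokes, 8-dimensional stand-in features, 4 object categories.
T, feat_dim = 5, 8
features = np.random.default_rng(1).normal(size=(T, feat_dim))
params = init_params(feat_dim, hidden_dim=16, n_classes=4)
probs = per_timestep_probs(features, params)

# Assumed schedule: later timesteps, which have seen more strokes, count more.
weights = np.arange(1, T + 1) / T
loss = weighted_sequence_loss(probs, label=2, weights=weights)
```

Because a prediction is produced after every stroke, the same forward pass can serve on-the-fly recognition: the row of `probs` for the most recent stroke is the current guess while the drawing is still in progress.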
