SketchTransfer: A Challenging New Task for Exploring Detail-Invariance and the Abstractions Learned by Deep Networks

Deep networks have achieved excellent results in perceptual tasks, yet their ability to generalize to variations not seen during training has come under increasing scrutiny. In this work we focus on their ability to have invariance towards the presence or absence of details. For example, humans are able to watch cartoons, which are missing many visual details, without being explicitly trained to do so. As another example, 3D rendering software is a relatively recent development, yet people are able to understand such rendered scenes even though they are missing details (consider a film like Toy Story). The failure of ma- chine learning algorithms to do this indicates a significant gap in generalization between human abilities and the abilities of deep networks. We propose a dataset that will make it easier to study the detail-invariance problem concretely. We produce a concrete task for this: SketchTransfer, and we show that state-of-the-art domain transfer algorithms still struggle with this task. The state-of-the-art technique which achieves over 95% on MNIST → SVHN transfer only achieves 59% accuracy on the SketchTransfer task, which is much better than random (11% accuracy) but falls short of the 87% accuracy of a classifier trained directly on labeled sketches. This indicates that this task is approachable with today’s best methods but has substantial room for improvement.

[1]  Ioannis Mitliagkas,et al.  Manifold Mixup: Better Representations by Interpolating Hidden States , 2018, ICML.

[2]  Tolga Tasdizen,et al.  Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning , 2016, NIPS.

[3]  Reiichiro Nakano,et al.  Neural Painters: A learned differentiable constraint for generating brushstroke paintings , 2019, ArXiv.

[4]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[5]  R Devon Hjelm,et al.  On Adversarial Mixup Resynthesis , 2019, NeurIPS.

[6]  Ningyuan Zheng,et al.  StrokeNet: A Neural Painting Environment , 2018, ICLR.

[7]  Douglas Eck,et al.  A Neural Representation of Sketch Drawings , 2017, ICLR.

[8]  Bria Long,et al.  Developmental changes in the ability to draw distinctive features of object categories , 2019, CogSci.

[9]  James Hays,et al.  The sketchy database , 2016, ACM Trans. Graph..

[10]  Alex Lamb,et al.  KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).

[11]  Matthias Bethge,et al.  Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet , 2019, ICLR.

[12]  Shin Ishii,et al.  Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Dacheng Tao,et al.  Self-Supervised Representation Learning by Rotation Feature Decoupling , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yoshua Bengio,et al.  Interpolation Consistency Training for Semi-Supervised Learning , 2019, IJCAI.

[15]  Yong Jae Lee,et al.  ShadowDraw: real-time user guidance for freehand drawing , 2011, ACM Trans. Graph..

[16]  Eric P. Xing,et al.  High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Sina Honari,et al.  Adversarial Mixup Resynthesizers , 2019, DGS@ICLR.

[18]  Jascha Sohl-Dickstein,et al.  Adversarial Examples that Fool both Human and Computer Vision , 2018, ArXiv.

[19]  Michael C. Frank,et al.  Developmental changes in the ability to draw distinctive features of object categories , 2019, Journal of Vision.

[20]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[21]  Sergey Levine,et al.  Recurrent Independent Mechanisms , 2019, ICLR.

[22]  David Berthelot,et al.  MixMatch: A Holistic Approach to Semi-Supervised Learning , 2019, NeurIPS.

[23]  Oriol Vinyals,et al.  Synthesizing Programs for Images using Reinforced Adversarial Learning , 2018, ICML.

[24]  Michael C. Frank,et al.  Drawings as a window into developmental changes in object representations , 2018, CogSci.

[25]  Marc Alexa,et al.  How do humans sketch objects? , 2012, ACM Trans. Graph..

[26]  Hongyi Zhang,et al.  mixup: Beyond Empirical Risk Minimization , 2017, ICLR.

[27]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28]  Yoshua Bengio,et al.  GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning , 2019, ArXiv.

[29]  Leon A. Gatys,et al.  A Neural Algorithm of Artistic Style , 2015, ArXiv.

[30]  Matthias Bethge,et al.  ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness , 2018, ICLR.

[31]  Jürgen Schmidhuber,et al.  Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.

[32]  Aaron J. Alford The Quick , 2008 .

[33]  Stefano Ermon,et al.  A DIRT-T Approach to Unsupervised Domain Adaptation , 2018, ICLR.

[34]  Yun Ma,et al.  Virtual Mixup Training for Unsupervised Domain Adaptation , 2019, ArXiv.

[35]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[36]  Alex Lamb,et al.  Deep Learning for Classical Japanese Literature , 2018, ArXiv.

[37]  Masashi Sugiyama,et al.  Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting , 2012, ICML.

[38]  Shuchang Zhou,et al.  Learning to Paint With Model-Based Deep Reinforcement Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).