On the Performance of GoogLeNet and AlexNet Applied to Sketches

This work studies how Convolutional Neural Networks, trained to identify objects primarily in photos, perform when applied to more abstract representations of the same objects. Our main goal is to better understand the generalization abilities of these networks and their learned internal representations. We show that both GoogLeNet and AlexNet are largely unable to recognize abstract sketches that humans recognize easily. Moreover, we show that the measured efficacy varies considerably across classes, and we discuss possible reasons for this.
