Can We Teach Computers to Understand Art? Domain Adaptation for Enhancing Deep Networks Capacity to De-Abstract Art

Abstract: Humans comprehend a natural scene at a single glance; painters and other visual artists, through their abstract representations, have pushed this capacity to its limit. Computer vision solutions have matched human performance in many visual recognition problems. In this paper we address the problem of recognizing the genre (subject) of digitized paintings using Convolutional Neural Networks (CNNs), as part of the more general problem of dealing with abstract and/or artistic representations of scenes. First, we establish the state-of-the-art performance by training a CNN from scratch. Next, we identify aspects that hinder the CNNs' recognition, such as artistic abstraction. Further, we test various domain adaptation methods that could enhance the subject recognition capabilities of the CNNs. The evaluation is performed on a database of 80,000 annotated digitized paintings, which is tentatively extended with artistic photographs, either original or stylized, in order to emulate artistic representations. Surprisingly, the most efficient domain adaptation method is not neural style transfer. Finally, the paper provides an experiment-based assessment of the abstraction level that CNNs are able to achieve.
