Judging a Book By its Cover

Book covers communicate information to potential readers, but can that same information be learned by computers? We propose using a deep Convolutional Neural Network (CNN) to predict the genre of a book from the visual cues provided by its cover. The purpose of this research is to investigate whether relationships between books and their covers can be learned. Determining the genre of a book is a difficult task, however, because covers can be ambiguous and genres can be overarching. Nevertheless, we show that a CNN can extract features and learn the underlying design rules that designers use to define a genre. With machine learning, the large amount of available resources can be brought to bear on the book cover design process. In addition, we present a new, challenging dataset that can be used for many pattern recognition tasks.
