Classification of Book Genres By Cover and Title

This paper discusses the classification of books by genre based purely on cover image and title, without prior knowledge of the author or origin. We implemented several methods to assess how well books can be distinguished using only these two characteristics. First, we used a color-based distribution approach. We then applied transfer learning with convolutional neural networks to the cover image and natural language processing to the title text. The image and text modalities yielded similar accuracy, which suggests that we have reached a ceiling in distinguishing between the genres we defined. This interpretation is supported by the accuracy being quite close to the human oracle accuracy.
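
As an illustration of what a color-based distribution feature might look like, the following is a minimal sketch and not the implementation used in this work. The abstract does not specify the color space, the number of bins, or the downstream classifier, so the per-channel RGB histogram, the default of 8 bins, and the function name `color_histogram` are all assumptions made for this example.

```python
def color_histogram(pixels, bins=8):
    """Return a normalized per-channel color histogram.

    pixels: iterable of (r, g, b) tuples with integer values in 0..255.
    bins:   number of bins per channel (assumed value, not from the paper).
    The result is a flat list of length 3 * bins; each channel's
    block of `bins` entries sums to 1.
    """
    counts = [[0] * bins for _ in range(3)]
    total = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Map 0..255 onto bin indices 0..bins-1.
            counts[channel][value * bins // 256] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [c / total for channel_counts in counts for c in channel_counts]


# Hypothetical 4x4 cover that is solid red.
cover = [(255, 0, 0)] * 16
features = color_histogram(cover)
```

A feature vector of this kind could be passed to any standard classifier as a baseline. Because it discards spatial layout entirely, it serves only as a simple point of comparison for the convolutional features.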
