Japanese Text Classification by Character-level Deep ConvNets and Transfer Learning

Temporal (one-dimensional) Convolutional Neural Network (Temporal CNN, ConvNet) is an emergent technology for text understanding. The input for the ConvNets could be either a sequence of words or a sequence of characters. In the latter case there are no needs for natural language processing that depends on a language such as morphological analysis. Past studies showed that the character-level ConvNets worked well for news category classification and sentiment analysis / classification tasks in English and romanized Chinese text corpus. In this article we apply the character-level ConvNets to Japanese text understanding. We also attempt to reuse meaningful representations that are learned in the ConvNets from a large-scale dataset in the form of transfer learning, inspired by its success in the field of image recognition. As for the application to the news category classification and the sentiment analysis and classification tasks in Japanese text corpus, the ConvNets outperformed N-gram-based classifiers. In addition, our ConvNets transfer learning frameworks worked well for a task which is similar to one used for pre-training.

[1]  Alessandro Moschitti,et al.  Twitter Sentiment Analysis with Deep Convolutional Neural Networks , 2015, SIGIR.

[2]  Razvan Pascanu,et al.  Advances in optimizing recurrent networks , 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[3]  Cícero Nogueira dos Santos,et al.  Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts , 2014, COLING.

[4]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[5]  Francesco Romani,et al.  Ranking a stream of news , 2005, WWW '05.

[6]  Alessandro Moschitti,et al.  UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification , 2015, *SEMEVAL.

[7]  Jitendra Malik,et al.  Analyzing the Performance of Multilayer Neural Networks for Object Recognition , 2014, ECCV.

[8]  Christopher Potts,et al.  Learning Word Vectors for Sentiment Analysis , 2011, ACL.

[9]  Jure Leskovec,et al.  Inferring Networks of Substitutable and Complementary Products , 2015, KDD.

[10]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[11]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Yuji Matsumoto,et al.  Applying Conditional Random Fields to Japanese Morphological Analysis , 2004, EMNLP.

[13]  Antonio Gulli,et al.  The anatomy of a news search engine , 2005, WWW '05.

[14]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[15]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[16]  Bowen Zhou,et al.  Classifying Relations by Ranking with Convolutional Neural Networks , 2015, ACL.

[17]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[18]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[20]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[21]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[22]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[23]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[24]  Geoffrey Zweig,et al.  Linguistic Regularities in Continuous Space Word Representations , 2013, NAACL.

[25]  Anton van den Hengel,et al.  Image-Based Recommendations on Styles and Substitutes , 2015, SIGIR.

[26]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[27]  Xiang Zhang,et al.  Text Understanding from Scratch , 2015, ArXiv.