An Innovative Word Encoding Method For Text Classification Using Convolutional Neural Network

Text classification plays a vital role today, especially given the intensive use of social networking media. Recently, various convolutional neural network architectures have been used for text classification, most commonly with one-hot vectors or word embeddings as input. This paper presents a new language-independent word encoding method for text classification. The proposed model converts raw text into a low-dimensional feature representation, with minimal or no preprocessing, using a new approach called binary unique number of word (BUNOW). BUNOW assigns each unique word an integer ID in a dictionary and represents that ID as a k-dimensional vector of its binary equivalent. The resulting vectors are fed into a convolutional neural network (CNN) for classification. The proposed model reduces the number of network parameters and allows faster computation with fewer layers: it treats the word as the atomic unit of the document, as word-level models do, while consuming less memory than character-level representations. The CNN model can handle other languages or multilingual text without any change to the encoding method. The model outperforms character-level and very deep character-level CNN models in terms of accuracy, network parameters, and memory consumption. On the AG’s News dataset it achieves a total classification accuracy of 91.99% (error 8.01%), compared with 91.45% (error 8.55%) for state-of-the-art methods, while reducing the input feature vector and the neural network parameters by 62% and 34%, respectively.
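The encoding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenization, the order in which IDs are assigned, the reservation of ID 0 for padding and unknown words, the most-significant-bit-first ordering, and the helper names `build_vocab`, `to_binary`, and `encode` are all assumptions made for illustration. The only property taken from the text is that each unique word receives an integer ID whose binary form is a k-dimensional vector.

```python
from typing import Dict, List


def build_vocab(docs: List[str]) -> Dict[str, int]:
    """Assign each unique whitespace-separated token an integer ID, starting at 1.

    ID 0 is reserved for padding and unknown words (an assumption of this sketch).
    """
    vocab: Dict[str, int] = {}
    for doc in docs:
        for word in doc.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def to_binary(word_id: int, k: int) -> List[int]:
    """Return word_id as a list of k bits, most significant bit first."""
    if not 0 <= word_id < 2 ** k:
        raise ValueError(f"ID {word_id} does not fit in {k} bits")
    return [(word_id >> shift) & 1 for shift in range(k - 1, -1, -1)]


def encode(doc: str, vocab: Dict[str, int], k: int, max_len: int) -> List[List[int]]:
    """Encode a document as a max_len x k binary matrix, truncating or zero-padding."""
    ids = [vocab.get(word, 0) for word in doc.split()][:max_len]
    ids += [0] * (max_len - len(ids))
    return [to_binary(word_id, k) for word_id in ids]


docs = ["stocks rise today", "stocks fall"]
vocab = build_vocab(docs)
# Smallest k such that every ID in 0..len(vocab) fits in k bits.
k = max(1, len(vocab).bit_length())
matrix = encode("stocks fall", vocab, k, max_len=4)
```

Because k bits can distinguish 2^k IDs, the vector length grows only logarithmically with vocabulary size, whereas a one-hot vector grows linearly. In this toy example four words need k = 3 bits per word instead of a four-dimensional one-hot vector.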
