Deep encrypted text categorization

Long short-term memory (LSTM) is an effective approach for capturing long-range temporal context in sequences of arbitrary length, and it has shown impressive performance in sentence and document modeling. To leverage this, we apply LSTM networks to encrypted text categorization at both the character and the word level. The texts are transformed into dense word vectors using a bag-of-words embedding. These dense vectors are fed into recurrent layers that capture contextual information, followed by a dense layer with a nonlinear activation function (softmax) for classification. The optimal network architecture was found by conducting experiments with varying network parameters and structures. All experiments were run for up to 1000 epochs with learning rates in the range [0.01, 0.5]. Most of the LSTM network structures performed well under 5-fold cross-validation. Based on the 5-fold cross-validation results, we conclude that character-level inputs handle encrypted texts more effectively than word-level inputs, because the character-level input retains more information from the low-level textual representation. Character-level LSTM models achieved a highest accuracy of 0.99 and word-level models a highest accuracy of 0.94 in the 5-fold cross-validation setting. On the real-world test data of the CDMC 2016 e-News categorization task, word-level LSTM models attained a highest accuracy of 0.43.
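As a rough illustration of the pipeline described above (embedding, recurrent layer, dense softmax layer), the following is a minimal sketch of a character-level LSTM classifier in a Keras-style API. The vocabulary size, layer widths, sequence length, and number of classes are assumptions for illustration, not the exact configuration used in the experiments.

```python
# Hypothetical sketch of a character-level LSTM text classifier.
# All hyperparameters below are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences

num_classes = 5      # assumed number of e-News categories
vocab_size = 128     # assumed character-level vocabulary size
max_len = 1000       # assumed maximum document length in characters

def encode(texts):
    """Map each character to an integer id and pad/truncate to a fixed length."""
    seqs = [[min(ord(c), vocab_size - 1) for c in t] for t in texts]
    return pad_sequences(seqs, maxlen=max_len)

model = Sequential([
    Embedding(vocab_size, 128, input_length=max_len),  # dense character embeddings
    LSTM(128),                                         # recurrent layer captures context
    Dense(num_classes, activation="softmax"),          # classification layer
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(encode(train_texts), train_labels, epochs=..., validation_split=0.2)
```

In practice the word-level variant would replace the character ids with word indices from a tokenizer, and the cross-validation loop would retrain this model on each of the 5 folds.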
