AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification

Recently, deep learning models have been shown to achieve remarkable performance on sentence and document classification tasks. In this work, we propose a novel framework called AC-BLSTM for modeling sentences and documents, which combines an asymmetric convolutional neural network (ACNN) with a bidirectional long short-term memory (BLSTM) network. Experimental results demonstrate that our model achieves state-of-the-art results on five tasks, including sentiment analysis, question type classification, and subjectivity classification. To further improve the performance of AC-BLSTM, we propose a semi-supervised learning framework called G-AC-BLSTM for text classification that combines a generative model with AC-BLSTM.
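
To make the ACNN-plus-BLSTM idea concrete, the following is a minimal sketch (not the authors' code) of an AC-BLSTM-style classifier in PyTorch. The layer sizes, pooling step, and class/variable names are illustrative assumptions: each k x d convolution over the sentence matrix is factored into a 1 x d convolution followed by a k x 1 convolution (Inception-style asymmetric factorization), the per-position features from all branches are concatenated and fed to a BLSTM, and a pooled summary of the BLSTM outputs drives a softmax classifier.

# Hypothetical AC-BLSTM-style sketch; layer sizes and pooling are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

class ACBLSTMSketch(nn.Module):
    def __init__(self, embed_dim=300, num_filters=100, kernel_sizes=(2, 3, 4),
                 lstm_hidden=150, num_classes=5, dropout=0.5):
        super().__init__()
        self.branches = nn.ModuleList()
        for k in kernel_sizes:
            self.branches.append(nn.Sequential(
                # 1 x d convolution collapses the embedding dimension at each word position
                nn.Conv2d(1, num_filters, kernel_size=(1, embed_dim)),
                nn.ReLU(),
                # k x 1 convolution then mixes k neighbouring positions (asymmetric factorization)
                nn.Conv2d(num_filters, num_filters, kernel_size=(k, 1), padding=(k - 1, 0)),
                nn.ReLU(),
            ))
        self.blstm = nn.LSTM(num_filters * len(kernel_sizes), lstm_hidden,
                             batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, x):
        # x: (batch, 1, seq_len, embed_dim) sentence matrix of word embeddings
        seq_len = x.size(2)
        feats = []
        for branch in self.branches:
            h = branch(x)               # (batch, num_filters, seq_len + k - 1, 1)
            h = h[:, :, :seq_len, :]    # trim padding so all branches align per position
            feats.append(h.squeeze(3))  # (batch, num_filters, seq_len)
        # concatenate branch features at each position and feed the sequence to the BLSTM
        seq = torch.cat(feats, dim=1).transpose(1, 2)  # (batch, seq_len, filters * branches)
        out, _ = self.blstm(seq)
        pooled, _ = out.max(dim=1)      # max-pool over positions (one common pooling choice)
        return self.classifier(self.dropout(pooled))

# Usage example with random tensors standing in for an embedding lookup
model = ACBLSTMSketch()
dummy = torch.randn(8, 1, 40, 300)      # batch of 8 sentences, 40 tokens, 300-dim vectors
logits = model(dummy)                   # (8, num_classes)

For the semi-supervised G-AC-BLSTM variant described above, the abstract only states that a generative model is combined with AC-BLSTM; a GAN-style setup in which the discriminator is extended with an extra "generated" class is one plausible reading, but the sketch above deliberately covers only the supervised AC-BLSTM path.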
