Investigating Capsule Network and Semantic Feature on Hyperplanes for Text Classification

As an essential component of natural language processing, text classification relies on deep learning in recent years. Various neural networks are designed for text classification on the basis of word embedding. However, polysemy is a fundamental feature of the natural language, which brings challenges to text classification. One polysemic word contains more than one sense, while the word embedding procedure conflates different senses of a polysemic word into a single vector. Extracting the distinct representation for the specific sense could thus lead to fine-grained models with strong generalization ability. It has been demonstrated that multiple senses of a word actually reside in linear superposition within the word embedding so that specific senses can be extracted from the original word embedding. Therefore, we propose to use capsule networks to construct the vectorized representation of semantics and utilize hyperplanes to decompose each capsule to acquire the specific senses. A novel dynamic routing mechanism named ‘routing-on-hyperplane’ will select the proper sense for the downstream classification task. Our model is evaluated on 6 different datasets, and the experimental results show that our model is capable of extracting more discriminative semantic features and yields a significant performance gain compared to other baseline methods.

[1]  Rui Zhang,et al.  Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents , 2016, NAACL.

[2]  Diyi Yang,et al.  Hierarchical Attention Networks for Document Classification , 2016, NAACL.

[3]  Geoffrey E. Hinton,et al.  Matrix capsules with EM routing , 2018, ICLR.

[4]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[5]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[6]  Dan Roth,et al.  Learning Question Classifiers , 2002, COLING.

[7]  Daniel Jurafsky,et al.  Do Multi-Sense Embeddings Improve Natural Language Understanding? , 2015, EMNLP.

[8]  Yaohui Jin,et al.  MCapsNet: Capsule Network for Text with Multi-Task Learning , 2018, EMNLP.

[9]  J. R. Firth,et al.  Studies in Linguistic Analysis. , 1974 .

[10]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[11]  Yann LeCun,et al.  Very Deep Convolutional Networks for Text Classification , 2016, EACL.

[12]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[13]  Bowen Zhou,et al.  A Structured Self-attentive Sentence Embedding , 2017, ICLR.

[14]  Philip S. Yu,et al.  Zero-shot User Intent Detection via Capsule Neural Networks , 2018, EMNLP.

[15]  Yang Zhang,et al.  Adaptive Learning of Local Semantic and Global Structure Representations for Text Classification , 2018, COLING.

[16]  Sanjeev Arora,et al.  Linear Algebraic Structure of Word Senses, with Applications to Polysemy , 2016, TACL.

[17]  Jun Zhao,et al.  Recurrent Convolutional Neural Networks for Text Classification , 2015, AAAI.

[18]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[19]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[20]  Geoffrey E. Hinton,et al.  Transforming Auto-Encoders , 2011, ICANN.

[21]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[22]  Charu C. Aggarwal,et al.  A Survey of Text Classification Algorithms , 2012, Mining Text Data.

[23]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[24]  Tong Zhang,et al.  Deep Pyramid Convolutional Neural Networks for Text Categorization , 2017, ACL.

[25]  Bo Pang,et al.  A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts , 2004, ACL.

[26]  Christopher D. Manning,et al.  Baselines and Bigrams: Simple, Good Sentiment and Topic Classification , 2012, ACL.

[27]  Bo Pang,et al.  Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales , 2005, ACL.

[28]  Min Yang,et al.  Investigating Capsule Networks with Dynamic Routing for Text Classification , 2018, EMNLP.