Short Text Classification via Knowledge powered Attention with Similarity Matrix based CNN

Short text, such as chat messages, SMS, and product reviews, is increasingly common on the web. Classifying it accurately is an important and challenging task, and many existing approaches struggle with it because of word ambiguity and data sparsity. To address this problem, we propose a knowledge powered attention with similarity matrix based convolutional neural network (KASM) model, which combines knowledge with a deep neural network to capture more comprehensive information. We use a knowledge graph (KG) to enrich the semantic representation of short text; specifically, we introduce parent-entity information into the model. We also consider literal-level word interaction between the short text and the label representation, and use a similarity matrix based convolutional neural network (CNN) to extract it. To measure the importance of knowledge, we introduce attention mechanisms that select the important information. Experimental results on five standard datasets show that our model significantly outperforms state-of-the-art methods.
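The abstract names two components: a similarity matrix between the words of the short text and the label representation, and an attention step that weights knowledge by importance. The sketch below is a minimal illustration of those two ideas in plain Python, not the KASM implementation. Cosine similarity, dot-product attention scores, the function names, and the toy vectors are all assumptions introduced here; the abstract does not specify them. The CNN that would consume the similarity matrix is omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_matrix(text_vecs, label_vecs):
    """Rows index words of the short text, columns index words of the label."""
    return [[cosine(t, l) for l in label_vecs] for t in text_vecs]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, concept_vecs):
    """Weight knowledge (concept) vectors by dot-product relevance to a query
    vector, then return the weights and the weighted sum of the concepts."""
    scores = [sum(q * c for q, c in zip(query, vec)) for vec in concept_vecs]
    weights = softmax(scores)
    dim = len(concept_vecs[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, concept_vecs))
              for i in range(dim)]
    return weights, pooled

# Hypothetical 2-dimensional embeddings, for illustration only.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three words of a short text
labels = [[1.0, 0.0], [0.0, 2.0]]             # two words describing a label
sim = similarity_matrix(text, labels)         # 3 x 2 matrix, input to a CNN

concepts = [[2.0, 0.0], [0.0, 2.0]]           # two candidate concept vectors
weights, pooled = attend([1.0, 0.0], concepts)
```

In this toy setup the first concept points in the same direction as the query, so it receives the larger attention weight. In a full model the similarity matrix would be passed through convolution and pooling layers, and the pooled knowledge vector would be combined with the text representation before classification.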
