Deep Short Text Classification with Knowledge Powered Attention

Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous because they lack sufficient contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We use Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects, and we classify a short text with the help of this conceptual information. Unlike traditional approaches, our model acts like a human being, who has an intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
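
The abstract does not spell out how the two attention weights are computed or combined, so the following is only a minimal sketch of the general idea under stated assumptions, not the model's actual formulation. It assumes that the short text and each concept are already embedded as vectors, that relevance is scored with a plain dot product (a swapped-in simplification), that the concept set is summarized by its mean vector, and that the two weight distributions are mixed with a hypothetical coefficient `gamma`. The function names are illustrative.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = np.exp(scores - np.max(scores))
    return shifted / shifted.sum()


def concept_weights(text_vec, concept_vecs, gamma=0.5):
    """Combine two attention distributions over a set of concepts.

    text_vec:     (d,) embedding of the short text (assumed given).
    concept_vecs: (n, d) embeddings of the retrieved concepts (assumed given).
    gamma:        hypothetical mixing coefficient in [0, 1].
    """
    # Concept-towards-short-text style weights: how relevant each concept
    # is to the text. Dot-product scoring is an assumption of this sketch.
    text_attention = softmax(concept_vecs @ text_vec)

    # Concept-towards-concept-set style weights: how central each concept
    # is within the set, scored here against the mean concept vector
    # (also an assumption of this sketch).
    set_attention = softmax(concept_vecs @ concept_vecs.mean(axis=0))

    # A convex combination of two distributions is again a distribution.
    return gamma * text_attention + (1.0 - gamma) * set_attention


def knowledge_vector(text_vec, concept_vecs, gamma=0.5):
    """Attention-weighted sum of the concept vectors."""
    weights = concept_weights(text_vec, concept_vecs, gamma)
    return weights @ concept_vecs


if __name__ == "__main__":
    text = np.array([1.0, 0.0])
    concepts = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
    print(concept_weights(text, concepts))
    print(knowledge_vector(text, concepts))
```

In a classifier built along these lines, the resulting knowledge vector would typically be concatenated with the short-text representation before the final prediction layer; in a trained model the scoring functions and the mixing coefficient would be learned rather than fixed as they are here.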
