Incorporating knowledge into neural network for text representation

Abstract: Text representation is a key task for many natural language processing applications such as document classification, ranking, and sentiment analysis. Its goal is to represent unstructured text documents numerically so that they can be processed mathematically. Most existing methods leverage deep learning to produce a representation of text. However, these models do not account for the fact that text itself is usually semantically ambiguous and carries limited information. For this reason, it is necessary to draw on an external knowledge base to better understand text. In this paper, we propose a novel framework named Text Concept Vector, which leverages both a neural network and a knowledge base to produce a high-quality representation of text. Specifically, a raw text is first conceptualized and represented by a set of concepts through a large taxonomy knowledge base. A neural network then transforms the conceptualized text into a vector that encodes both the semantic information and the concept information of the original text. We test our framework on both a sentence-level task and a document-level task. The experimental results illustrate the effectiveness of our work.
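
The two-stage idea described in the abstract (conceptualize the text through an isA taxonomy, then encode words and concepts into one vector) can be illustrated with a minimal sketch. This is not the authors' Text Concept Vector model: the toy taxonomy, the random embeddings, the averaging encoder, and every name below (`TOY_TAXONOMY`, `conceptualize`, `encode`, `DIM`) are assumptions made purely for illustration, standing in for a large knowledge base and a trained neural network.

```python
import math
import random

# Hypothetical toy isA taxonomy: term -> {concept: P(concept | term)}.
# A real system would query a large taxonomy knowledge base instead.
TOY_TAXONOMY = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "banana": {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}

DIM = 4  # illustrative embedding size


def conceptualize(tokens, taxonomy):
    """Sum concept scores over all tokens and normalize to a distribution."""
    scores = {}
    for tok in tokens:
        for concept, p in taxonomy.get(tok, {}).items():
            scores[concept] = scores.get(concept, 0.0) + p
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total else {}


def make_embeddings(vocab, dim, seed):
    """Random vectors as a stand-in for learned embeddings."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}


WORD_EMB = make_embeddings(sorted(TOY_TAXONOMY) + ["eats", "buys"], DIM, seed=0)
CONCEPT_EMB = make_embeddings(
    sorted({c for m in TOY_TAXONOMY.values() for c in m}), DIM, seed=1
)


def encode(tokens, taxonomy, word_emb, concept_emb, dim):
    """Concatenate an averaged word vector with a weighted concept vector."""
    word_vecs = [word_emb[t] for t in tokens if t in word_emb]
    if word_vecs:
        word_part = [sum(v[i] for v in word_vecs) / len(word_vecs) for i in range(dim)]
    else:
        word_part = [0.0] * dim
    concepts = conceptualize(tokens, taxonomy)
    concept_part = [
        sum(w * concept_emb[c][i] for c, w in concepts.items()) for i in range(dim)
    ]
    # A single tanh non-linearity stands in for the neural encoder.
    return [math.tanh(x) for x in word_part + concept_part]


if __name__ == "__main__":
    tokens = ["apple", "buys", "microsoft"]
    print(conceptualize(tokens, TOY_TAXONOMY))
    print(encode(tokens, TOY_TAXONOMY, WORD_EMB, CONCEPT_EMB, DIM))
```

In this sketch the ambiguous word "apple" is pulled toward the "company" concept when it co-occurs with "microsoft" and toward "fruit" when it co-occurs with "banana", which is the kind of disambiguation signal the abstract attributes to the knowledge base.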
