Embedding Logic Rules Into Recurrent Neural Networks

Incorporating prior knowledge into recurrent neural network (RNN) is of great importance for many natural language processing tasks. However, most of the prior knowledge is in the form of structured knowledge and is difficult to be exploited in the existing RNN framework. By extracting the logic rules from the structured knowledge and embedding the extracted logic rule into the RNN, this paper proposes an effective framework to incorporate the prior information in the RNN models. First, we demonstrate that commonly used prior knowledge could be decomposed into a set of logic rules, including the knowledge graph, social graph, and syntactic dependence. Second, we present a technique to embed a set of logic rules into the RNN by the way of feedback masks. Finally, we apply the proposed approach to the sentiment classification and named entity recognition task. The extensive experimental results verify the effectiveness of the embedding approach. The encouraging results suggest that the proposed approach has the potential for applications in other NLP tasks.

[1]  Gregory F. Cooper,et al.  NESTOR: A Computer-Based Medical Diagnostic Aid That Integrates Causal and Probabilistic Knowledge. , 1984 .

[2]  Ting Liu,et al.  Document Modeling with Gated Recurrent Neural Network for Sentiment Classification , 2015, EMNLP.

[3]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[4]  Xin Wang,et al.  Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory , 2015, ACL.

[5]  Sameer Singh,et al.  Low-Dimensional Embeddings of Logic , 2014, ACL 2014.

[6]  Sameer Singh,et al.  Injecting Logical Background Knowledge into Embeddings for Relation Extraction , 2015, NAACL.

[7]  Xuanjing Huang,et al.  Cached Long Short-Term Memory Neural Networks for Document-Level Sentiment Classification , 2016, EMNLP.

[8]  Eric P. Xing,et al.  Harnessing Deep Neural Networks with Logic Rules , 2016, ACL.

[9]  Jude W. Shavlik,et al.  Knowledge-Based Artificial Neural Networks , 1994, Artif. Intell..

[10]  Jude Shavlik,et al.  An Approach to Combining Explanation-based and Neural Learning Algorithms , 1989 .

[11]  Artur S. d'Avila Garcez,et al.  The Connectionist Inductive Learning and Logic Programming System , 1999, Applied Intelligence.

[12]  Lars Backstrom,et al.  The Anatomy of the Facebook Social Graph , 2011, ArXiv.

[13]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[14]  Hongyu Guo,et al.  An Empirical Study on the Effect of Negation Words on Sentiment , 2014, ACL.

[15]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[16]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[17]  Praveen Paritosh,et al.  Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.

[18]  Andreas Vlachos,et al.  Dependency Recurrent Neural Language Models for Sentence Completion , 2015, ACL.

[19]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[20]  Joshua B. Tenenbaum,et al.  Human-level concept learning through probabilistic program induction , 2015, Science.

[21]  Alexander J. Smola,et al.  Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS) , 2014, KDD.

[22]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[23]  Henry E. Brady Causation and Explanation in Social Science , 2008 .

[24]  Eric Nichols,et al.  Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.

[25]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[26]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[27]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[28]  James Hammerton,et al.  Named Entity Recognition with Long Short-Term Memory , 2003, CoNLL.

[29]  Tomas Mikolov,et al.  RNNLM - Recurrent Neural Network Language Modeling Toolkit , 2011 .