Tackling Graphical NLP problems with Graph Recurrent Networks

How to properly model graphs is a long-standing and important problem in NLP, where popular types of graphs include knowledge graphs, semantic graphs and dependency graphs. Compared with other data structures, such as sequences and trees, graphs are generally more powerful in representing complex correlations among entities. For example, a knowledge graph stores real-world entities (such as "Barack_Obama" and "U.S.") and their relations (such as "live_in" and "lead_by"). Properly encoding a knowledge graph benefits user applications such as question answering and knowledge discovery. Modeling graphs is also very challenging, probably because graphs usually contain a large number of relations, many of which are cyclic. Recent years have witnessed the success of deep learning, especially RNN-based models, on many NLP problems. RNNs and their variants have also been extensively studied on several graph problems and have shown preliminary success. Despite this success, RNN-based models suffer from two major drawbacks on graphs. First, they can only consume sequential data, so input graphs must be linearized, which loses important structural information. Second, the linearized sequences are usually very long, so RNNs take a long time to encode them. In this thesis, we propose a novel graph neural network, named graph recurrent network (GRN). We study our GRN model on four very different tasks, including machine reading comprehension, relation extraction and machine translation. Some of these tasks take undirected graphs without edge labels, while the others take directed graphs with edge labels. To account for these differences, we gradually enhance our GRN model, for example by incorporating edge labels and adding an RNN decoder. Carefully designed experiments show the effectiveness of GRN on all these tasks.
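The two drawbacks of linearization can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the serialization used in the thesis: the depth-first scheme, the function name `linearize`, and the "born_in" and "part_of" edges are all hypothetical, added to the "live_in" and "lead_by" relations from the example above.

```python
from collections import defaultdict

# Toy knowledge graph as (head, edge_label, tail) triples.
# "live_in" and "lead_by" come from the example in the text;
# "born_in" and "part_of" are hypothetical additions for illustration.
EDGES = [
    ("Barack_Obama", "live_in", "U.S."),
    ("U.S.", "lead_by", "Barack_Obama"),
    ("Barack_Obama", "born_in", "Hawaii"),
    ("Hawaii", "part_of", "U.S."),
]


def linearize(edges, root):
    """Serialize a graph into a token sequence by depth-first traversal.

    This is one assumed linearization scheme among many. A node that has
    already been visited is emitted again by name but not expanded, which
    is how cycles and shared nodes get flattened into a sequence.
    """
    out_edges = defaultdict(list)
    for head, label, tail in edges:
        out_edges[head].append((label, tail))

    tokens, visited = [], set()

    def visit(node):
        tokens.append(node)
        if node in visited:
            return  # revisited node: emit its name only, do not expand
        visited.add(node)
        for label, tail in out_edges[node]:
            tokens.append(label)
            visit(tail)

    visit(root)
    return tokens


tokens = linearize(EDGES, "Barack_Obama")
print(tokens)
print(len(tokens))
```

In this sketch a graph with three nodes and four edges becomes a nine-token sequence, and "Hawaii" ends up several tokens away from the first mention of "U.S." although the two nodes are directly connected in the graph. A sequential encoder has to recover that adjacency from the repeated node names alone.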
