Multi-view and multi-task training of RST discourse parsers

We experiment with different ways of training LSTM networks to predict RST discourse trees. The main challenge for RST discourse parsing is the limited amount of training data. We combat this by regularizing our models with supervision from related tasks as well as alternative views on discourse structures. We show that a simple LSTM sequential discourse parser benefits from this multi-view and multi-task framework, achieving 12-15% error reductions over our baseline (depending on the metric) and results that rival more complex state-of-the-art parsers.
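As a reading aid for the error-reduction figure, here is a minimal sketch of how a relative error reduction is usually computed. It assumes scores are F1-like values in [0, 1] and that error is 1 minus the score. The function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_score: float, new_score: float) -> float:
    """Return the fraction of the baseline's error removed by the new system.

    Assumes both scores lie in [0, 1] and that error = 1 - score.
    """
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores chosen only for illustration, not reported results.
reduction = relative_error_reduction(0.50, 0.56)
print(f"relative error reduction: {reduction:.1%}")
```

Under this definition, a six-point absolute gain over a baseline score of 0.50 removes 12% of the baseline's error, which shows why the relative figure can look larger than the absolute difference in scores.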
