Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank

Discourse parsing has long been treated as a stand-alone problem, independent of constituency or dependency parsing. Most approaches to this problem are pipelined rather than end-to-end, complex, and not self-contained: they assume gold-standard text segmentation into Elementary Discourse Units (EDUs) and rely on external parsers for syntactic features. In this paper we propose the first end-to-end discourse parser that jointly parses at both the syntax and discourse levels, as well as the first syntacto-discourse treebank, built by integrating the Penn Treebank with the RST Treebank. Built upon our recent span-based constituency parser, this joint syntacto-discourse parser requires no preprocessing whatsoever (such as segmentation or feature extraction) and achieves state-of-the-art end-to-end discourse parsing accuracy.
