Top-Down RST Parsing Utilizing Granularity Levels in Documents

Some downstream NLP tasks exploit discourse dependency trees converted from RST trees. To obtain better discourse dependency trees, we need to improve the accuracy of RST trees in the upper parts of their structures. We therefore propose a novel neural top-down RST parsing method that exploits three levels of granularity in a document, namely paragraphs, sentences, and elementary discourse units (EDUs), to parse a document accurately and efficiently. Parsing is done in a top-down manner for each granularity level, by recursively splitting a larger text span into two smaller ones while predicting nuclearity and relation labels for the divided spans. On the RST-DT corpus, our method achieved state-of-the-art results: an unlabeled span score of 87.0 and a nuclearity-labeled span score of 74.6, along with a relation-labeled span score of 60.0, which is comparable to the state of the art. Furthermore, discourse dependency trees converted from our RST trees also achieved state-of-the-art results: an unlabeled attachment score of 64.9 and a labeled attachment score of 48.5.

Introduction

The discourse structure of a document can be represented as a tree, just as the syntactic structure of a sentence can. Rhetorical Structure Theory (RST) (Mann and Thompson 1987) is a well-known and widely studied framework for representing a document as a tree. An RST tree is a type of constituent tree whose terminal nodes (leaves) are elementary discourse units (EDUs), clause-like units, and whose non-terminal nodes represent the nuclearity status, nucleus or satellite, of the span consisting of a single EDU or a sequence of EDUs. The span dominated by a nucleus is more essential than the one dominated by a satellite; that is, the satellite has the role of supporting the nucleus. Furthermore, rhetorical relations are defined between two adjacent spans: a mono-nuclear relation, such as "Elaboration" or "Condition", is assigned between a nucleus and its satellite, and a multi-nuclear relation, such as "Same-unit" or "Topic-change", is assigned between two nuclei. RST trees play important roles in natural language processing (NLP) tasks such as summarization (Marcu 1998).
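To make the top-down splitting procedure concrete, below is a minimal, illustrative sketch in Python of recursive span splitting over EDUs. Everything here is a hypothetical stand-in rather than the authors' implementation: the Node class and the score_split and predict_labels functions take the place of the paper's neural components, and the sketch collapses the three granularity levels into a single EDU-level pass.

```python
# Illustrative sketch of top-down RST parsing by recursive span splitting.
# Node, score_split, and predict_labels are hypothetical placeholders for
# the paper's neural model; this is a sketch of the control flow only.

from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Node:
    # Span of EDU indices [left, right) covered by this node.
    left: int
    right: int
    nuclearity: Optional[str] = None    # e.g. "NS", "SN", "NN"
    relation: Optional[str] = None      # e.g. "Elaboration", "Same-unit"
    children: Tuple["Node", ...] = ()


def parse_span(left: int, right: int,
               score_split: Callable[[int, int, int], float],
               predict_labels: Callable[[int, int, int], Tuple[str, str]]) -> Node:
    """Recursively split [left, right) at the highest-scoring position,
    labeling the two sub-spans, until single EDUs remain."""
    if right - left == 1:
        return Node(left, right)  # a single EDU is a leaf
    # Choose the split point k that maximizes the (hypothetical) score.
    k = max(range(left + 1, right),
            key=lambda j: score_split(left, j, right))
    nuclearity, relation = predict_labels(left, k, right)
    return Node(left, right, nuclearity, relation,
                children=(parse_span(left, k, score_split, predict_labels),
                          parse_span(k, right, score_split, predict_labels)))


# Toy usage with placeholder scorers: always split near the middle and
# always predict a nucleus-satellite "Elaboration" relation.
tree = parse_span(0, 4,
                  score_split=lambda l, k, r: -abs(k - (l + r) / 2),
                  predict_labels=lambda l, k, r: ("NS", "Elaboration"))
```

Since the paper's method parses each granularity level in turn, a faithful implementation would presumably restrict candidate split points to paragraph or sentence boundaries at the upper levels before descending to EDUs, rather than considering every EDU boundary as this sketch does.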

[1] Graeme Hirst et al. A Linear-Time Bottom-Up Discourse Parser with Constraints and Post-Editing, 2014, ACL.

[2] Parminder Bhatia et al. Better Document-level Sentiment Analysis from RST Discourse Parsing, 2015, EMNLP.

[3] Mirella Lapata et al. Learning Contextually Informed Representations for Linear-Time Discourse Parsing, 2017, EMNLP.

[4] Navdeep Jaitly et al. Pointer Networks, 2015, NIPS.

[5] Qi Li et al. Discourse Parsing with Attention-based Hierarchical Neural Networks, 2016, EMNLP.

[6] Nicholas Asher et al. How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT, 2017, EMNLP.

[7] Daniel Marcu et al. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory, 2001, SIGDIAL Workshop.

[8] Kenji Sagae et al. Fast Rhetorical Structure Theory Discourse Parsing, 2015, ArXiv.

[9] Shafiq R. Joty et al. Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis, 2013, ACL.

[10] Daniel Marcu. Improving summarization through rhetorical parsing tuning, 1998.

[11] Peter Jansen et al. Discourse Complements Lexical Semantics for Non-factoid Answer Reranking, 2014, ACL.

[12] Nan Yu et al. Transition-based Neural RST Parsing with Implicit Syntax Features, 2018, COLING.

[13] Philipp Koehn et al. Statistical Significance Tests for Machine Translation Evaluation, 2004, EMNLP.

[14] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[15] Timothy Dozat et al. Deep Biaffine Attention for Neural Dependency Parsing, 2016, ICLR.

[16] Yoshua Bengio et al. Straight to the Tree: Constituency Parsing with Neural Syntactic Distance, 2018, ACL.

[17] Masaaki Nagata et al. Single-Document Summarization as a Tree Knapsack Problem, 2013, EMNLP.

[18] Liang Huang et al. Linear-Time Constituency Parsing with RNNs and Dynamic Programming, 2018, ACL.

[19] Masaaki Nagata et al. Empirical comparison of dependency conversions for RST discourse trees, 2016, SIGDIAL Conference.

[20] Eduard H. Hovy et al. Recursive Deep Models for Discourse Parsing, 2014, EMNLP.

[21] Philipp Koehn et al. Six Challenges for Neural Machine Translation, 2017, NMT@ACL.

[22] Graeme Hirst et al. Text-level Discourse Parsing with Rich Linguistic Features, 2012, ACL.

[23] Dan Klein et al. A Minimal Span-Based Neural Constituency Parser, 2017, ACL.

[24] Andrew McCallum et al. Linguistically-Informed Self-Attention for Semantic Role Labeling, 2018, EMNLP.

[25] Noah A. Smith et al. Neural Discourse Structure for Text Categorization, 2017, ACL.

[26] Hiroyuki Shindo et al. A Span Selection Model for Semantic Role Labeling, 2018, EMNLP.

[27] Shafiq R. Joty et al. A Unified Linear-Time Framework for Sentence-Level Discourse Parsing, 2019, ACL.

[28] William C. Mann et al. Rhetorical Structure Theory: A Theory of Text Organization, 1987.

[29] Jeffrey Pennington et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[30] Houfeng Wang et al. A Two-Stage Parsing Method for Text-Level Discourse Analysis, 2017, ACL.

[31] Luke S. Zettlemoyer et al. Deep Contextualized Word Representations, 2018, NAACL.

[32] Yizhong Wang et al. Toward Fast and Accurate Neural Discourse Segmentation, 2018, EMNLP.

[33] Ming Zhou et al. Selective Encoding for Abstractive Sentence Summarization, 2017, ACL.

[34] Helmut Prendinger et al. A Novel Discourse Parser Based on Support Vector Machine Classification, 2009, ACL.