Adversarial Learning for Discourse Rhetorical Structure Parsing

Text-level discourse rhetorical structure (DRS) parsing is known to be challenging due to the notorious lack of training data. Although recent top-down DRS parsers can better leverage global document context and have achieved some success, their performance is still far from perfect. To our knowledge, all previous DRS parsers make local decisions at each time step, for either bottom-up node composition or top-down split point ranking, and largely ignore DRS parsing from a global viewpoint. Clearly, an entire DRS tree cannot be adequately built through such local decisions alone. In this work, we present our insight on evaluating the pros and cons of the entire DRS tree for global optimization. Specifically, building on recent well-performing top-down frameworks, we introduce a novel method that transforms both gold-standard and predicted constituency trees into tree diagrams with two color channels. We then train an adversarial bot to discriminate between gold and fake tree diagrams, thereby estimating the quality of generated DRS trees from a global perspective. Experiments on both the RST-DT and CDTB corpora, evaluated with the original Parseval metric, show that our parser substantially improves over previous state-of-the-art parsers.
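The pipeline the abstract describes, rasterizing a discourse tree into a two-channel diagram and scoring it with an adversarial discriminator, can be sketched as follows. This is a minimal illustration only: the `render_tree` encoding (channel 0 for span/split structure, channel 1 for a nuclearity code), the fixed grid size, the linear discriminator, and the least-squares GAN objective are all assumptions for the sketch, not the paper's actual implementation.

```python
import numpy as np

def subtree_size(node):
    """Number of EDU leaves under a node (leaves are ints, internal
    nodes are (left, right, nuclearity) tuples)."""
    return 1 if not isinstance(node, tuple) else (
        subtree_size(node[0]) + subtree_size(node[1]))

def render_tree(tree, n_edus, max_depth=8):
    """Rasterize a binary discourse tree into a (max_depth, n_edus, 2)
    'tree diagram'. Channel 0 marks span coverage and split points;
    channel 1 carries a scalar nuclearity code. Hypothetical encoding."""
    img = np.zeros((max_depth, n_edus, 2))
    def walk(node, lo, hi, depth):
        if not isinstance(node, tuple) or depth >= max_depth:
            return
        left, right, nuc = node
        split = lo + subtree_size(left)
        img[depth, lo:hi, 0] = 1.0      # span covered at this depth
        img[depth, split - 1, 0] = 0.5  # split-point marker
        img[depth, lo:hi, 1] = nuc      # nuclearity code, e.g. 0 / 0.5 / 1
        walk(left, lo, split, depth + 1)
        walk(right, split, hi, depth + 1)
    walk(tree, 0, n_edus, 0)
    return img

def lsgan_d_loss(real_img, fake_img, w):
    """Least-squares discriminator loss on flattened diagrams: push the
    gold diagram's score toward 1 and the predicted one's toward 0."""
    score = lambda x: float(x.ravel() @ w)
    return 0.5 * ((score(real_img) - 1.0) ** 2 + score(fake_img) ** 2)

# Toy usage: EDUs 0 and 1 grouped first (nuclearity 0.5), then joined
# with EDU 2 (nuclearity 1.0); discriminate against an empty diagram.
gold = render_tree(((0, 1, 0.5), 2, 1.0), n_edus=3)
fake = np.zeros_like(gold)
w = np.zeros(gold.size)  # untrained discriminator weights
print(gold.shape, lsgan_d_loss(gold, fake, w))
```

In a full model the linear scorer would be a convolutional network over the diagram, and its adversarial signal would be backpropagated into the tree-building parser; the sketch only fixes the interface between the two halves.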
