Chinese Paragraph-level Discourse Parsing with Global Backward and Local Reverse Reading

Constructing discourse structure trees is the fundamental task of discourse parsing, and most previous work has focused on English. Because of cultural and linguistic differences, methods that succeed on English discourse parsing cannot be transferred directly to Chinese. This is especially true at the paragraph level, where discourse units are longer and explicit connectives are fewer. To alleviate these issues, we propose two reading modes, global backward reading and local reverse reading, for constructing Chinese paragraph-level discourse trees. The former processes the discourse units of a document from the end to the beginning in order to exploit the left-branching bias of discourse structure in Chinese. The latter reverses the order of the paragraphs within a discourse unit to sharpen the difference in coherence between adjacent discourse units. Experimental results on the Chinese MCDTB corpus demonstrate that our model outperforms all strong baselines.
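The two reading modes can be pictured as reorderings of the input, independent of any particular model. The following is a minimal sketch of that ordering only, assuming a document is represented as a list of discourse units and each unit as a list of paragraph strings; the function names and the data layout are hypothetical and are not taken from the paper, whose actual parser is not reproduced here.

```python
from typing import Iterator, List, Tuple

Unit = List[str]  # one discourse unit = a list of paragraphs (assumed layout)


def local_reverse(unit: Unit) -> Unit:
    """Local reverse reading: reverse the paragraph order inside one unit."""
    return unit[::-1]


def global_backward_pairs(units: List[Unit]) -> Iterator[Tuple[Unit, Unit]]:
    """Global backward reading: visit adjacent unit pairs from the end of
    the document to the beginning. Each unit in a pair is locally reversed.
    """
    for i in range(len(units) - 1, 0, -1):
        yield local_reverse(units[i - 1]), local_reverse(units[i])


if __name__ == "__main__":
    # Hypothetical document with three discourse units.
    doc = [["p1"], ["p2", "p3"], ["p4"]]
    for left, right in global_backward_pairs(doc):
        print(left, right)
```

In this sketch the last pair of adjacent units is yielded first, which mirrors reading the document from the end; how a parser scores or merges those pairs is left open.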
