Joint Modeling of Recognizing Macro Chinese Discourse Nuclearity and Relation Based on Structure and Topic Gated Semantic Network

As research in Natural Language Processing shifts from words to sentences, paragraphs, and higher-level semantic units, discourse analysis has become a crucial step toward understanding how articles are structured. Compared with the micro level, macro-level Chinese discourse analysis has rarely been investigated and faces considerable challenges. First, macro discourse units are longer and more loosely related to one another, which makes it harder to grasp their topic and to recognize the relations between them. Second, it remains unclear how to effectively exploit the connection between nuclearity recognition and relation recognition. To address these challenges, we propose a joint model for recognizing macro Chinese discourse nuclearity and relation based on a Structure and Topic Gated Semantic Network (STGSN). The model uses a Gated Linear Unit (GLU) so that the semantic representation of a discourse unit changes with its position and with the topic. Moreover, we analyze the results of our models on nuclearity and relation recognition and explore the potential relationship between the two tasks. Experiments show that the proposed approach is effective.
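
To make the gating idea concrete, the following is a minimal sketch of a GLU-style gate, written under stated assumptions rather than as the STGSN architecture itself. It uses the general GLU form (a linear projection multiplied element-wise by a sigmoid gate) and assumes, for illustration only, that the gate is computed from a separate context vector holding position and topic features. The function names, dimensions, and feature values are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Return weights @ vec + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def glu_gate(semantic, context, w_val, b_val, w_gate, b_gate):
    """GLU-style gating: (W s + b) * sigmoid(V c + d), element-wise.

    Assumption: `semantic` is a discourse-unit representation and
    `context` holds position and topic features; this split is
    illustrative and not taken from the paper.
    """
    values = linear(w_val, b_val, semantic)
    gates = [sigmoid(g) for g in linear(w_gate, b_gate, context)]
    return [v * g for v, g in zip(values, gates)]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)
dim_sem, dim_ctx, dim_out = 4, 3, 4

semantic = [0.2, -0.1, 0.7, 0.4]   # hypothetical unit representation
context = [0.0, 1.0, 0.3]          # hypothetical position + topic features

w_val = random_matrix(dim_out, dim_sem, rng)
w_gate = random_matrix(dim_out, dim_ctx, rng)
b_val = [0.0] * dim_out
b_gate = [0.0] * dim_out

gated = glu_gate(semantic, context, w_val, b_val, w_gate, b_gate)
print(gated)
```

Because each gate value lies strictly between 0 and 1, the context can only scale each dimension of the projected representation down, so the same unit yields different gated vectors at different positions or under different topics.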
