Learning Clause Representation from Dependency-Anchor Graph for Connective Prediction

A semantic representation that supports the choice of an appropriate connective between a pair of clauses inherently addresses discourse coherence, which is important for tasks such as narrative understanding, argumentation, and discourse parsing. We propose a novel clause embedding method that applies graph learning to a data structure we refer to as a dependency-anchor graph. The dependency-anchor graph incorporates two kinds of syntactic information, constituency structure and dependency relations, to highlight the relation between the subject and the verb phrase. This enhances coherence-related aspects of the representation. We design a neural model that learns a semantic representation for clauses through graph convolution over latent representations of the subject and verb phrase. We evaluate our method on two new datasets: a subset of a large corpus whose source texts are published novels, and a new dataset collected from students’ essays. The results demonstrate a significant improvement over tree-based models, confirming the importance of emphasizing the subject and verb phrase. The performance gap between the two datasets illustrates the challenges of analyzing students’ written text, and points to a potential evaluation task for coherence modeling and an application for suggesting revisions to students.
