Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition

To overcome the shortage of labeled data for implicit discourse relation recognition, previous work has attempted to generate training data automatically by removing explicit discourse connectives from sentences and then building models on these synthetic implicit examples. However, a previous study (Sporleder and Lascarides, 2008) showed that models trained on such synthetic data do not generalize well to natural (i.e., genuine) implicit discourse data. In this work we revisit this issue and present a multi-task learning based system that can effectively use synthetic data for implicit discourse relation recognition. Results on PDTB data show that, under the multi-task learning framework, our models that use the prediction of explicit discourse connectives as auxiliary learning tasks achieve an average F1 improvement of 5.86% over baseline models.
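As a rough illustration of the synthetic-data idea described in the abstract, the following minimal sketch strips an explicit connective from a sentence and keeps both the connective and a coarse relation label, so that the connective could serve as an auxiliary prediction target. It is not the authors' pipeline: the function name `make_synthetic_example`, the tiny `CONNECTIVE_TO_RELATION` map, and the one-sense-per-connective simplification are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny connective-to-relation map (an assumption for
# illustration only; real connectives are often ambiguous between senses).
CONNECTIVE_TO_RELATION = {
    "because": "Contingency",
    "but": "Comparison",
    "afterwards": "Temporal",
    "in addition": "Expansion",
}


def make_synthetic_example(sentence):
    """Remove the first known mid-sentence connective.

    Returns (arg1, arg2, connective, relation), or None if no connective
    from the map occurs between two non-empty text spans.
    """
    for connective, relation in CONNECTIVE_TO_RELATION.items():
        # Match the connective as a whole word, with optional surrounding commas.
        pattern = r",?\s+" + re.escape(connective) + r"\b,?\s+"
        match = re.search(pattern, sentence, flags=re.IGNORECASE)
        if match:
            arg1 = sentence[:match.start()].strip()
            arg2 = sentence[match.end():].strip().rstrip(".")
            if arg1 and arg2:
                # The connective is kept as a label: it is removed from the
                # text but remains available as an auxiliary prediction target.
                return arg1, arg2, connective, relation
    return None


example = make_synthetic_example(
    "The flight was cancelled because the storm closed the airport."
)
print(example)
```

In a multi-task setup of the kind the abstract describes, the pair (arg1, arg2) would be the shared input, the relation label the main target, and the removed connective the auxiliary target; how these tasks are actually combined is not specified in this abstract.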

[1] Daniel Marcu, et al. An Unsupervised Approach to Recognizing Discourse Relations, 2002, ACL.

[2] Ani Nenkova, et al. Easily Identifiable Discourse Relations, 2008, COLING.

[3] Massimiliano Pontil, et al. Regularized multi-task learning, 2004, KDD.

[4] Daniel Marcu, et al. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory, 2001, SIGDIAL Workshop.

[5] Hwee Tou Ng, et al. Recognizing Implicit Discourse Relations in the Penn Discourse Treebank, 2009, EMNLP.

[6] Edwin V. Bonilla, et al. Multi-task Gaussian Process Prediction, 2007, NIPS.

[7] Danushka Bollegala, et al. A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension, 2010, EMNLP.

[8] Lou Boves, et al. Evaluating discourse-based answer extraction for why-question answering, 2007, SIGIR.

[9] Sasha J. Blair-Goldensohn, et al. Long-answer question answering and rhetorical-semantic relations, 2007.

[10] Jian Su, et al. Predicting Discourse Connectives for Implicit Discourse Relation Recognition, 2010, COLING.

[11] Jonathan Baxter, et al. A Model of Inductive Bias Learning, 2000, J. Artif. Intell. Res.

[12] Tong Zhang, et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data, 2005, J. Mach. Learn. Res.

[13] Ani Nenkova, et al. Automatic sense prediction for implicit discourse relations in text, 2009, ACL.

[14] Beth Levin, et al. English Verb Classes and Alternations: A Preliminary Investigation, 1993.

[15] Livio Robaldo, et al. The Penn Discourse TreeBank 2.0, 2008, LREC.

[16] Daniel Marcu, et al. Sentence Level Discourse Parsing using Syntactic and Lexical Information, 2003, NAACL.

[17] Satoshi Sekine, et al. Using Phrasal Patterns to Identify Discourse Relations, 2006, HLT-NAACL.

[18] G. Meade. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory, 2001.

[19] Eugene Charniak, et al. BLLIP North American news text, 2008.

[20] Tony Jebara, et al. Multi-task feature and kernel selection for SVMs, 2004, ICML.

[21] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.

[22] Janyce Wiebe, et al. Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis, 2009, CL.

[23] William C. Mann, et al. Rhetorical Structure Theory: Toward a functional theory of text organization, 1988.

[24] Livio Robaldo, et al. The Penn Discourse Treebank 2.0 Annotation Manual, 2007.

[25] Alex Lascarides, et al. Using automatically labelled examples to classify rhetorical relations: an assessment, 2008.

[26] James Pustejovsky, et al. Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources, 2006, SIGDIAL Workshop.

[27] Danushka Bollegala, et al. Semi-supervised Discourse Relation Classification with Structural Learning, 2011, CICLing.

[28] Charles A. Micchelli, et al. A Spectral Regularization Framework for Multi-Task Structure Learning, 2007, NIPS.

[29] Sebastian Thrun, et al. Is Learning The n-th Thing Any Easier Than Learning The First?, 1995, NIPS.

[30] Chu-Ren Huang, et al. 22nd International Conference on Computational Linguistics, 2008.

[31] Uwe Reyle, et al. Ontology-driven discourse analysis for information extraction, 2005, Data Knowl. Eng.

[32] Tong Zhang, et al. A High-Performance Semi-Supervised Learning Method for Text Chunking, 2005, ACL.

[33] Jian Su, et al. Kernel Based Discourse Relation Recognition with Temporal Ordering Information, 2010, ACL.