Leveraging bilingually-constrained synthetic data via multi-task neural networks for implicit discourse relation recognition

Recognizing implicit discourse relations is an important but challenging task in discourse understanding. To alleviate the shortage of labeled data, previous work automatically generates synthetic implicit data (SynData) as additional training data by removing connectives from explicit discourse instances. Although SynData has proven useful for implicit discourse relation recognition, it suffers from a meaning shift problem and a domain problem. In this paper, we first propose using bilingually-constrained synthetic implicit data (BiSynData) to enrich the training data, which alleviates the drawbacks of SynData. BiSynData is constructed from a bilingual sentence-aligned corpus by exploiting implicit/explicit mismatches between the two languages. We then design a multi-task neural network model that incorporates BiSynData to improve implicit discourse relation recognition. Experimental results on both the English PDTB and Chinese CDTB data sets show that the proposed method achieves significant improvements over baselines using SynData.
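
The construction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the connective lists, the relation labels attached to them, and the function names below are all hypothetical, and the sketch assumes that an implicit/explicit mismatch means a sentence pair whose English side carries a connective while its Chinese side does not. A real system would use a full connective lexicon and would need to disambiguate connective usage.

```python
import re

# Hypothetical, tiny connective-to-relation map (an assumption for illustration).
EN_CONNECTIVES = {"but": "Comparison", "because": "Contingency", "so": "Contingency"}
# Hypothetical list of Chinese connectives used to detect explicit marking.
ZH_CONNECTIVES = ["但是", "因为", "所以"]


def find_connective(english):
    """Return (connective, relation) for the first known connective, else None."""
    for conn, rel in EN_CONNECTIVES.items():
        if re.search(rf"\b{conn}\b", english, flags=re.IGNORECASE):
            return conn, rel
    return None


def remove_connective(english, conn):
    """Delete the first occurrence of the connective and tidy whitespace."""
    stripped = re.sub(rf"\b{conn}\b,?\s*", "", english, count=1, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", stripped).strip()


def build_bisyndata(pairs):
    """Keep (Chinese, English) pairs that are explicit in English but implicit in Chinese."""
    data = []
    for zh, en in pairs:
        found = find_connective(en)
        zh_explicit = any(c in zh for c in ZH_CONNECTIVES)
        if found and not zh_explicit:
            conn, rel = found
            data.append(
                {"text": remove_connective(en, conn), "connective": conn, "relation": rel}
            )
    return data


if __name__ == "__main__":
    pairs = [
        ("他累了，他回家了。", "He was tired, so he went home."),
        ("因为下雨，我们留在家里。", "Because it rained, we stayed home."),
    ]
    for item in build_bisyndata(pairs):
        print(item)
```

In this sketch the second pair is discarded because its Chinese side is already explicit, so only pairs where the two languages disagree on explicitness contribute synthetic instances.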
