Addressing Class Imbalance for Improved Recognition of Implicit Discourse Relations

In this paper we address the problem of skewed class distribution in implicit discourse relation recognition. We examine the performance of classifiers for both binary classification predicting if a particular relation holds or not and for multi-class prediction. We review prior work to point out that the problem has been addressed differently for the binary and multi-class problems. We demonstrate that adopting a unified approach can significantly improve the performance of multi-class prediction. We also propose an approach that makes better use of the full annotations in the training set when downsampling is used. We report significant absolute improvements in performance in multi-class prediction, as well as significant improvement of binary classifiers for detecting the presence of implicit Temporal, Comparison and Contingency relations.

[1]  Hwee Tou Ng,et al.  A PDTB-styled end-to-end discourse parser , 2012, Natural Language Engineering.

[2]  Gustavo E. A. P. A. Batista,et al.  A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.

[3]  Guodong Zhou,et al.  Cross-argument inference for implicit discourse relation recognition , 2012, CIKM '12.

[4]  Jason Baldridge,et al.  Discourse Connective Argument Identification with Connective Specific Rankers , 2008, 2008 IEEE International Conference on Semantic Computing.

[5]  George Forman,et al.  Feature shaping for linear SVM classifiers , 2009, KDD.

[6]  Kathleen McKeown,et al.  Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation , 2013, ACL.

[7]  Livio Robaldo,et al.  Sense Annotation in the Penn Discourse Treebank , 2008, CICLing.

[8]  Hwee Tou Ng,et al.  Recognizing Implicit Discourse Relations in the Penn Discourse Treebank , 2009, EMNLP.

[9]  Katharina Morik,et al.  Combining Statistical Learning with a Knowledge-Based Approach - A Case Study in Intensive Care Monitoring , 1999, ICML.

[10]  Ani Nenkova,et al.  Easily Identifiable Discourse Relations , 2008, COLING.

[11]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[12]  Richard Johansson,et al.  Shallow Discourse Parsing with Conditional Random Fields , 2011, IJCNLP.

[13]  Ani Nenkova,et al.  Automatic sense prediction for implicit discourse relations in text , 2009, ACL.

[14]  Claire Cardie,et al.  Improving Implicit Discourse Relation Recognition Through Feature Set Optimization , 2012, SIGDIAL Conference.

[15]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[16]  Nello Cristianini,et al.  Controlling the Sensitivity of Support Vector Machines , 1999 .

[17]  Ludmila I. Kuncheva,et al.  Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy , 2003, Machine Learning.

[18]  Danushka Bollegala,et al.  A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension , 2010, EMNLP.

[19]  Stephen Kwek,et al.  Applying Support Vector Machines to Imbalanced Datasets , 2004, ECML.

[20]  Scott Weinstein,et al.  Centering: A Framework for Modeling the Local Coherence of Discourse , 1995, CL.

[21]  Livio Robaldo,et al.  The Penn Discourse TreeBank 2.0. , 2008, LREC.