Discourse Relation Sense Classification Using Cross-argument Semantic Similarity Based on Word Embeddings

This paper describes our system for the CoNLL 2016 Shared Task’s supplementary task on Discourse Relation Sense Classification. Our official submission employs a Logistic Regression classifier with several cross-argument similarity features based on word embeddings. It achieves overall F-scores of 64.13 on the Dev set, 63.31 on the Test set, and 54.69 on the Blind set, ranking first overall for the task. We compare the feature-based Logistic Regression classifier to different Convolutional Neural Network architectures. After the official submission, we enriched our model for Non-Explicit relations by including similarities of explicit connectives with the relation arguments, and part-of-speech similarities based on modal verbs. This improved our Non-Explicit result by 1.46 points on the Dev set and by 0.36 points on the Blind set.
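As a rough illustration of what a cross-argument similarity feature based on word embeddings can look like, the following minimal sketch computes two similarity scores between the two arguments of a relation. The abstract does not list the system's actual features, so the two features shown here (centroid cosine similarity and maximum pairwise word similarity), the function names, and the toy embedding table are all assumptions made for illustration.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# A real system would load pre-trained word vectors instead.
TOY_EMBEDDINGS = {
    "rain": [0.9, 0.1, 0.0],
    "storm": [0.8, 0.2, 0.1],
    "flights": [0.1, 0.9, 0.2],
    "delayed": [0.2, 0.8, 0.3],
}


def average_vector(tokens, embeddings):
    """Return the mean embedding of the known tokens, or None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cross_argument_features(arg1_tokens, arg2_tokens, embeddings):
    """Compute hypothetical similarity features between Arg1 and Arg2."""
    centroid1 = average_vector(arg1_tokens, embeddings)
    centroid2 = average_vector(arg2_tokens, embeddings)
    if centroid1 is None or centroid2 is None:
        centroid_sim = 0.0
    else:
        centroid_sim = cosine(centroid1, centroid2)

    # Similarity of the closest word pair across the two arguments.
    pair_sims = [
        cosine(embeddings[w1], embeddings[w2])
        for w1 in arg1_tokens if w1 in embeddings
        for w2 in arg2_tokens if w2 in embeddings
    ]
    return {
        "centroid_sim": centroid_sim,
        "max_pair_sim": max(pair_sims, default=0.0),
    }


features = cross_argument_features(
    ["rain", "storm"], ["flights", "delayed"], TOY_EMBEDDINGS
)
print(features)
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to a linear classifier such as Logistic Regression, with one vector per relation instance.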
