IMHO Fine-Tuning Improves Claim Detection

Claims are the central component of an argument. Detecting claims across different domains or data sets is often challenging because claims are conceptualized differently from one to the next. We propose to alleviate this problem by fine-tuning a language model on a Reddit corpus of 5.5 million opinionated claims. These claims are self-labeled by their authors using the internet acronyms IMO/IMHO (in my (humble) opinion). Empirical results show that this approach improves state-of-the-art performance in claim detection across four benchmark argumentation data sets by an average of 4 absolute F1 points. Because these data sets span diverse domains such as social media and student essays, the improvement demonstrates the robustness of fine-tuning on this novel corpus.
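The self-labeling idea can be illustrated with a minimal sketch. The abstract does not describe how the corpus was actually extracted, so the regular expression and the function names below (`is_self_labeled_claim`, `select_self_labeled`) are hypothetical and assume only that a sentence counts as self-labeled when it contains IMO or IMHO as a standalone token.

```python
import re

# Assumption: a self-labeled claim contains "IMO" or "IMHO" as a whole word,
# in any letter case. The paper's real extraction rules may differ.
IMHO_PATTERN = re.compile(r"\b(?:imo|imho)\b", re.IGNORECASE)


def is_self_labeled_claim(sentence: str) -> bool:
    """Return True if the sentence carries an IMO/IMHO marker."""
    return IMHO_PATTERN.search(sentence) is not None


def select_self_labeled(sentences: list[str]) -> list[str]:
    """Keep only the sentences whose authors marked them as opinions."""
    return [s for s in sentences if is_self_labeled_claim(s)]


if __name__ == "__main__":
    posts = [
        "IMHO the sequel is better than the original.",
        "The limousine arrived at noon.",
        "That policy is a mistake, imo.",
    ]
    for claim in select_self_labeled(posts):
        print(claim)
```

The word boundaries keep words such as "limousine" from matching, which is the kind of false positive a plain substring search would admit.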
