Mining arguments in scientific abstracts with discourse-level embeddings

Abstract: Argument mining is the automatic identification of argumentative structures in texts. In this work, we leverage existing discourse-level annotations to facilitate the identification of argumentative components and relations in scientific texts, which has been recognized as a particularly challenging task. We propose a new annotation schema and use it to augment a corpus of computational linguistics abstracts that had previously been annotated with discourse units and relations. Our initial experiments with the enriched corpus confirm the potential value of incorporating discourse information in argument mining tasks. To tackle the limitations posed by the lack of corpora containing both discourse and argumentative annotations, we explore two transfer learning approaches in which discourse parsing is used as an auxiliary task when training argument mining models. Because these models take no discourse information as input, they can be used to predict the argumentative structure of texts that lack discourse annotations.
