Annotation of Rhetorical Moves in Biochemistry Articles

This paper addresses a practical problem in the analysis of scientific writing: determining rhetorical moves, an important step in establishing the argument structure of biomedical articles. Building on the observation that the structure of scholarly writing in laboratory-based experimental sciences closely follows laboratory procedures, we concentrate on the Methods sections of these articles and identify rhetorical moves that are procedure-oriented. To support this analysis, we propose a verb-centric frame semantics together with a set of semantic roles. These components are designed to support a computational model that extends a promising existing proposal of rhetorical moves for this domain, one that has so far been purely descriptive. Our work also contributes to the understanding of argument-related annotation schemes. In particular, we conduct a detailed study with human annotators to confirm that our selection of semantic roles is effective in determining the underlying rhetorical structure of existing biomedical articles in an extensive dataset. The resulting annotated dataset provides the knowledge needed for our ultimate goal of analyzing biochemistry articles.
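To make the idea of a verb-centric frame more concrete, the following is a minimal sketch of how one Methods sentence might be represented as a verb plus labeled arguments and then mapped to a procedure-oriented move. The role names (`theme`, `instrument`, `duration`), the move labels, and the names `VerbFrame`, `HYPOTHETICAL_MOVES`, and `guess_move` are all assumptions made for illustration; they are not the role inventory or move set proposed in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VerbFrame:
    """One predicate (the verb) plus its labeled arguments."""
    verb: str
    # Maps a semantic role name to the text span that fills it.
    roles: Dict[str, str] = field(default_factory=dict)


# Hypothetical lookup from procedural verbs to move labels.
# Both the verbs and the labels are placeholders, not the paper's scheme.
HYPOTHETICAL_MOVES = {
    "centrifuge": "describe-procedure",
    "purchase": "list-materials",
}


def guess_move(frame: VerbFrame) -> str:
    """Return a move label for the frame's verb, or 'unlabeled' if unknown."""
    return HYPOTHETICAL_MOVES.get(frame.verb, "unlabeled")


# "The lysate was centrifuged in a benchtop centrifuge for 10 min."
frame = VerbFrame(
    verb="centrifuge",
    roles={
        "theme": "the lysate",
        "instrument": "a benchtop centrifuge",
        "duration": "10 min",
    },
)
print(guess_move(frame))
```

A lookup keyed on the verb alone is the simplest possible stand-in; a realistic model would presumably also condition on which roles are filled, which is the kind of information the human annotation study is meant to supply.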
