CODA-19: Reliably Annotating Research Aspects on 10,000+ CORD-19 Abstracts Using a Non-Expert Crowd

This paper introduces CODA-19, a human-annotated dataset that codes the Background, Purpose, Method, Finding/Contribution, and Other sections of 10,966 English abstracts in the COVID-19 Open Research Dataset (CORD-19). CODA-19 was created by 248 crowd workers from Amazon Mechanical Turk within 10 days, achieving labeling quality comparable to that of experts. Each abstract was annotated by nine different workers, and the final labels were obtained by majority vote. The inter-annotator agreement (Cohen's kappa) between the crowd and a biomedical expert (0.741) is comparable to the inter-expert agreement (0.788). CODA-19's labels reach an accuracy of 82.2% against the biomedical expert's labels, while the accuracy between experts was 85.0%. Reliable human annotations help scientists understand the rapidly growing coronavirus literature and also fuel AI/NLP research, but obtaining expert annotations is slow. We demonstrate that a non-expert crowd can be employed rapidly and at scale to join the fight against COVID-19.
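The aggregation and evaluation protocol described above is straightforward to reproduce. The sketch below is a minimal illustration, assuming scikit-learn and using hypothetical toy labels rather than the actual CODA-19 data: nine crowd votes per segment are collapsed to a single label by majority vote and then scored against expert labels with accuracy and Cohen's kappa.

```python
# Minimal sketch of the protocol described in the abstract: nine crowd labels
# per segment are collapsed by majority vote, then compared with an expert's
# labels via accuracy and Cohen's kappa. All data below is illustrative.
from collections import Counter

from sklearn.metrics import accuracy_score, cohen_kappa_score

ASPECTS = ["background", "purpose", "method", "finding", "other"]

def majority_vote(votes):
    """Return the most frequent label among a segment's crowd annotations."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical input: each inner list holds nine crowd labels for one segment.
crowd_votes = [
    ["background"] * 6 + ["purpose"] * 3,
    ["method"] * 5 + ["finding"] * 4,
    ["other"] * 5 + ["finding"] * 4,
]
expert_labels = ["background", "method", "finding"]

crowd_labels = [majority_vote(v) for v in crowd_votes]

print("accuracy:", accuracy_score(expert_labels, crowd_labels))
print("kappa:   ", cohen_kappa_score(expert_labels, crowd_labels, labels=ASPECTS))
```

Unlike raw accuracy, Cohen's kappa discounts the agreement expected by chance, which is why the paper reports both measures when comparing crowd and expert labels.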
