Multi2Claim: Generating Scientific Claims from Multi-Choice Questions for Scientific Fact-Checking

Training machine learning models to perform scientific fact-checking is challenging because of an expertise bottleneck: verifying scientific claims against textual evidence requires data with extensive domain-expert annotation, which limits the availability of suitable training datasets. Consequently, existing scientific fact-checking datasets are few in number and small in size. These limitations do not apply to multiple-choice question datasets, however, which are abundant because domain exams are a staple of modern education. As one of the first steps toward addressing the scarcity of fact-checking data in scientific domains, we propose Multi2Claim, a pipeline for automatically converting multiple-choice questions into fact-checking data. Applying the pipeline, we generated two large-scale datasets for scientific fact-checking: Med-Fact for the medical domain and Gsci-Fact for the general science domain. These are among the first large-scale scientific fact-checking datasets. We developed baseline models for the verdict-prediction task on each dataset. Additionally, we demonstrated that the datasets can improve weighted-F1 performance on existing fact-checking datasets such as SciFact, HEALTHVER, COVID-Fact, and CLIMATE-FEVER, in some cases by up to 26%.
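
To make the conversion concrete, here is a minimal, hypothetical sketch of the core idea: pairing a question stem with its correct answer yields a supported claim, while pairing it with each distractor yields a refuted one. The function name mcq_to_claims, the fill-in-the-blank template, and the example item are illustrative assumptions; the abstract does not specify the actual Multi2Claim pipeline.

```python
# Hypothetical sketch: convert one fill-in-the-blank multiple-choice
# question into (claim, verdict) pairs. This is NOT the paper's actual
# Multi2Claim pipeline, only an illustration of the underlying idea.

def mcq_to_claims(stem, options, answer_key, blank="___"):
    """Pair the stem with every option; the correct option produces a
    SUPPORTED claim and each distractor produces a REFUTED claim."""
    claims = []
    for key, option in options.items():
        claims.append({
            "claim": stem.replace(blank, option),
            "verdict": "SUPPORTED" if key == answer_key else "REFUTED",
        })
    return claims

# Example with a made-up medical exam item:
for pair in mcq_to_claims(
    stem="Deficiency of ___ causes scurvy.",
    options={"A": "vitamin A", "B": "vitamin C", "C": "vitamin D"},
    answer_key="B",
):
    print(pair["verdict"], "-", pair["claim"])
# Prints one SUPPORTED claim (vitamin C) and two REFUTED claims.
```

In practice, rewriting arbitrary interrogative stems into declarative claims is the hard part, which is why question-to-statement conversion is usually handled with sequence-to-sequence models rather than string templates.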

[1] Ankit Pal, et al. MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering, 2022, CHIL.

[2] Lucy Lu Wang, et al. Generating Scientific Claims for Zero-Shot Scientific Fact Checking, 2022, ACL.

[3] Smaranda Muresan, et al. COVID-Fact: Fact Extraction and Verification of Real-World Claims on COVID-19 Pandemic, 2021, ACL.

[4] Wenhan Xiong, et al. Zero-shot Fact Verification by Claim Generation, 2021, ACL.

[5] Carl T. Bergstrom, et al. Misinformation in and about science, 2021, Proceedings of the National Academy of Sciences.

[6] Hao Zhou, et al. LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification, 2020, AAAI.

[7] Jannis Bulian, et al. CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims, 2020, ArXiv.

[8] Neema Kotonya, et al. Explainable Automated Fact-Checking: A Survey, 2020, COLING.

[9] Jimmy J. Lin, et al. Pretrained Transformers for Text Ranking: BERT and Beyond, 2020, NAACL.

[10] Jon Roozenbeek, et al. Susceptibility to misinformation about COVID-19 around the world, 2020, Royal Society Open Science.

[11] Giuseppe Serra, et al. Distilling the Evidence to Augment Fact Verification Models, 2020, FEVER.

[12] Durgesh Nandini, et al. FakeCovid - A Multilingual Cross-domain Fact Check News Dataset for COVID-19, 2020, ICWSM Workshops.

[13] Jianfeng Gao, et al. DeBERTa: Decoding-enhanced BERT with Disentangled Attention, 2020, ICLR.

[14] Preslav Nakov, et al. That is a Known Lie: Detecting Previously Fact-Checked Claims, 2020, ACL.

[15] Hannaneh Hajishirzi, et al. Fact or Fiction: Verifying Scientific Claims, 2020, EMNLP.

[16] Daniel S. Weld, et al. SPECTER: Document-level Representation Learning using Citation-informed Transformers, 2020, ACL.

[17] Zhenghao Liu, et al. Coreferential Reasoning Learning for Language Representation, 2020, EMNLP.

[18] Arman Cohan, et al. Longformer: The Long-Document Transformer, 2020, ArXiv.

[19] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.

[20] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.

[21] Chenyan Xiong, et al. Fine-grained Fact Verification with Kernel Graph Attention Network, 2019, ACL.

[22] Rémi Louf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.

[23] Marcel Worring, et al. BERT for Evidence Retrieval and Claim Verification, 2019, ECIR.

[24] Christian Hansen, et al. MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims, 2019, EMNLP.

[25] Maosong Sun, et al. GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification, 2019, ACL.

[26] Niloy Ganguly, et al. AttentiveChecker: A Bi-Directional Attention Flow Mechanism for Fact Verification, 2019, NAACL.

[27] Iz Beltagy, et al. SciBERT: A Pretrained Language Model for Scientific Text, 2019, EMNLP.

[28] Daniel King, et al. ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing, 2019, BioNLP@ACL.

[29] Jaewoo Kang, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining, 2019, Bioinform.

[30] Haonan Chen, et al. Combining Fact Extraction and Verification with Neural Semantic Matching Networks, 2018, AAAI.

[31] Jing Qian, et al. A Survey on Natural Language Processing for Fake News Detection, 2018, LREC.

[32] Percy Liang, et al. Transforming Question Answering Datasets Into Natural Language Inference Datasets, 2018, ArXiv.

[33] Andrew McCallum, et al. Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking, 2018, ACL.

[34] Rachel Rudinger, et al. Hypothesis Only Baselines in Natural Language Inference, 2018, *SEM.

[35] Andreas Vlachos, et al. FEVER: a Large-scale Dataset for Fact Extraction and VERification, 2018, NAACL.

[36] Preslav Nakov, et al. Fact Checking in Community Forums, 2018, AAAI.

[37] Nelson F. Liu, et al. Crowdsourcing Multiple Choice Science Questions, 2017, NUT@EMNLP.

[38] Isabelle Augenstein, et al. A simple but tough-to-beat baseline for the Fake News Challenge stance detection task, 2017, ArXiv.

[39] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.

[40] Jakob Uszkoreit, et al. A Decomposable Attention Model for Natural Language Inference, 2016, EMNLP.

[41] Petroc Sumner, et al. The association between exaggeration in health related science news and academic press releases: retrospective observational study, 2014, BMJ: British Medical Journal.

[42] Andreas Vlachos, et al. Fact Checking: Task definition and dataset construction, 2014, LTCSS@ACL.

[43] Michael Krauthammer, et al. Broadening the Scope of Nanopublications, 2013, ESWC.

[44] Steven Woloshin, et al. Press Releases by Academic Medical Centers: Not So Academic?, 2009, Annals of Internal Medicine.

[45] Steven Woloshin, et al. Press releases: translating research into news, 2002, JAMA.

[46] Iryna Gurevych, et al. Evidence-based Verification for Real World Information Needs, 2021, ArXiv.

[47] Dina Demner-Fushman, et al. Evidence-based Fact-Checking of Health-related Claims, 2021, EMNLP.

[48] Daniel S. Weld, et al. S2ORC: The Semantic Scholar Open Research Corpus, 2020, ACL.

[49] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[50] Dmitry Ilvovsky, et al. Extract and Aggregate: A Novel Domain-Independent Approach to Factual Data Verification, 2019, EMNLP.

[51] Smaranda Muresan, et al. Where is Your Evidence: Improving Fact-checking by Justification Modeling, 2018.