Topical Classification of Food Safety Publications with a Knowledge Base

The vast body of scientific publications makes it increasingly challenging to find those relevant to a given research question and to make informed decisions based on them. Without automated tools, this becomes extremely difficult. One possible area for improvement is the automatic classification of publication abstracts by topic. This work introduces a novel, knowledge base-oriented publication classifier. The proposed method focuses on scalability and easy adaptability to other domains. Classification speed and accuracy are shown to be satisfactory in the very demanding field of food safety. The proposed approach shows much potential, and further development and evaluation of the method are needed.
