Making species checklists understandable to machines – a shift from relational databases to ontologies

Background: The scientific names of plants and animals play a major role in the life sciences, because information is indexed, integrated, and searched using them. The main problem with names is their ambiguity: more than one name may point to the same taxon, and multiple taxa may share the same name. In addition, scientific names change over time, which leaves them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSIDs), which are traditionally used in relational databases, and, more recently, HTTP URIs, which are applied on the Semantic Web by Linked Data applications.

Results: We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, to obtain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn to model the same content as Semantic Web ontologies in which taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time.

Conclusions: HTTP URIs are preferable for presenting the taxonomic information of species checklists. Unlike an LSID, an HTTP URI both identifies a taxon and operates as a web address from which additional information about the taxon can be retrieved. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomic data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more “intelligent” biological applications and services.
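
The difference between the two identifier types can be illustrated with a minimal Python sketch. It assumes the general LSID layout `urn:lsid:authority:namespace:object[:revision]`; the identifiers, the `example.org` authority, and the helper names `parse_lsid` and `is_dereferenceable` are hypothetical and are not taken from the paper or from TaxMeOn.

```python
from urllib.parse import urlparse


def parse_lsid(lsid: str) -> dict:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not a valid LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


def is_dereferenceable(identifier: str) -> bool:
    """Return True when the identifier is itself a web address (HTTP or HTTPS URI)."""
    parsed = urlparse(identifier)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical identifiers for the same taxon in the two schemes.
lsid = "urn:lsid:example.org:taxa:12345"
http_uri = "http://example.org/taxa/12345"

print(parse_lsid(lsid))
print(is_dereferenceable(lsid), is_dereferenceable(http_uri))
```

The LSID carries structure but needs a separate resolution service before anything can be fetched, whereas the HTTP URI can be requested directly by any web client, which is the property the conclusions rely on.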
