Developing a vocabulary and ontology for modeling insect natural history data: example data, use cases, and competency questions

Abstract

Insects are possibly the most taxonomically and ecologically diverse class of multicellular organisms on Earth. Consequently, they provide nearly unlimited opportunities to develop and test ecological and evolutionary hypotheses. Currently, however, large-scale studies of insect ecology, behavior, and trait evolution are impeded by the difficulty of obtaining and analyzing data derived from natural history observations of insects. These data are typically highly heterogeneous and widely scattered among many sources, which makes developing robust information systems to aggregate and disseminate them a significant challenge. As a step towards this goal, we report initial results of a new effort to develop a standardized vocabulary and ontology for insect natural history data. In particular, we describe a new database of representative insect natural history data derived from multiple sources (but focused on data from specimens in biological collections), an analysis of the abstract conceptual areas required for a comprehensive ontology of insect natural history data, and a database of use cases and competency questions to guide the development of data systems for insect natural history data. We also discuss data modeling and technology-related challenges that must be overcome to implement robust integration of insect natural history data.
