A conceptual model for data management in the field of ecology

Abstract. Conceptual models play an important role in identifying the domain under study and in establishing an interoperability framework between different scientific groups and tools working on the same or neighboring domains. Their importance stems from the fact that they describe the target domain in a technology-agnostic manner, using the domain's own terminology, considerations, and rules. In this paper we introduce a highly flexible data and metadata structure for information management in biodiversity and related fields. The model incorporates the concepts needed to develop a proper domain model for managing biodiversity data, e.g., data, data structure, metadata, metadata structure, and semantic descriptions of model elements. The model is designed in UML using object-oriented analysis paradigms. The data management teams of several large collaborative projects, as well as those of two research institutes, cooperated actively in the design of the model, ensuring that all aspects relevant to these very different projects and institutions are considered and that the model will be widely accepted. The model supports and encourages the reuse and sharing of its elements, which makes cross-dataset synthesis, comparison, merging, and search easier. The incorporated semantic package helps to annotate dataset variables and metadata attributes by means of ontologies, taxonomies, or thesauri. These annotations can be used for standardization, for localization, and for managing the variety of meanings that the same or similar variables have among community members. The model is currently being implemented and, once finished, will replace the model used in the current version of BExIS, a data management platform for biodiversity research.
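As an illustration only, the following minimal Python sketch shows how the concepts named in the abstract (data, a shareable data structure, metadata, and semantic annotations on variables) might relate to one another. All class, field, and vocabulary names here are hypothetical and chosen for this example; they are not taken from the UML model described in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SemanticAnnotation:
    """Link from a model element to a term in an ontology, taxonomy, or thesaurus."""
    vocabulary: str  # hypothetical vocabulary name, for illustration only
    term: str


@dataclass
class Variable:
    """One column of a dataset, optionally annotated with a semantic term."""
    name: str
    unit: str
    annotation: Optional[SemanticAnnotation] = None


@dataclass
class DataStructure:
    """Reusable description of a dataset's variables; may be shared by many datasets."""
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Dataset:
    """Data values plus the structure and metadata that describe them."""
    title: str
    structure: DataStructure
    metadata: Dict[str, str] = field(default_factory=dict)
    rows: List[Dict[str, float]] = field(default_factory=list)


def comparable_variables(a: Dataset, b: Dataset) -> List[Tuple[str, str]]:
    """Pair variables from two datasets that carry the same semantic annotation."""
    pairs = []
    for va in a.structure.variables:
        for vb in b.structure.variables:
            if va.annotation is not None and va.annotation == vb.annotation:
                pairs.append((va.name, vb.name))
    return pairs


# Two groups name the same quantity differently but annotate it with the same term.
soil_temperature = SemanticAnnotation("example-vocabulary", "soil temperature")

ds1 = Dataset(
    title="Plot survey A",
    structure=DataStructure("A", [Variable("soil_temp", "degC", soil_temperature)]),
    metadata={"owner": "Group A"},
    rows=[{"soil_temp": 11.5}],
)
ds2 = Dataset(
    title="Plot survey B",
    structure=DataStructure("B", [Variable("temperature_soil", "degC", soil_temperature),
                                  Variable("ph", "1")]),
    metadata={"owner": "Group B"},
    rows=[{"temperature_soil": 12.0, "ph": 6.4}],
)

print(comparable_variables(ds1, ds2))
```

In this sketch the shared annotation, rather than the variable name, is what identifies two columns as comparable, which is one way the annotations could help with cross-dataset comparison and merging.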
