A Vision for the Future Low-Temperature Geochemical Data-scape

Data sharing benefits the researcher, the scientific community, and, most importantly, the public by enabling more impactful analysis of data and greater transparency in scientific research. However, like many other scientific communities, the low-temperature geochemistry (LTG) community has generally not developed protocols and standards for publishing, citing, and versioning datasets. This paper is the product of a group of LTG and data scientists convened to strategize about the future management of LTG data. The group observed that the current landscape of sites for LTG data (the “data-scape”) is a “street bazaar” of data repositories. This was deemed appropriate because LTG scientists target many different scientific questions and produce data with different structures and volumes, described by copious and complex metadata. Nonetheless, the group agreed that publication of LTG science must be accompanied by sharing of data in publicly accessible repositories. To enable this for sample-based data, samples should be registered with globally unique persistent identifiers. LTG scientists should be able to use either highly structured databases designed for specialized types of data or generalized, unstructured, and non-targeted data repositories. The group proposed that the overall data informatics paradigm should shift from “build data repository, data will come” to “publish data online, cybertools will find.” In other words, the most important need within the growing and complex data-scape is for increasingly powerful tools for searching and cross-referencing data across the proliferating data repositories. This strategy requires increasing emphasis on data science and data management for LTG professionals and students.
