Ten best practices to strengthen stewardship and sharing of water science data in Canada

Water science data are a valuable asset that both underpins the original research project and supports new research questions, particularly in view of the increasingly complex water issues facing Canada and the world. Whilst there is general support for making data more broadly accessible, and a number of water science journals and funding agencies have adopted policies that require researchers to share data in accordance with the findable, accessible, interoperable, reusable (FAIR) principles, questions remain about how to manage data effectively so that they stay useful over time. Incorporating data management practices and standards at the outset of a water science research project enables researchers to efficiently locate, analyse and use data throughout the project lifecycle, and helps ensure the data retain their value after the project has ended. Here, some common misconceptions about data management are highlighted, along with insights and practical advice to assist established and early career water science researchers as they integrate data management best practices and tools into their research. Freely available tools and training opportunities offered in Canada through Global Water Futures, The Gordon Foundation DataStream, the Digital Research Alliance of Canada Portage Network, Compute Canada, and university libraries, among others, are also compiled. These include webinars, training videos, and individual support for the water science community that together enable researchers to protect their data assets and meet the expectations of journals and funders. The perspectives shared here have been developed as part of the Global Water Futures programme's efforts to improve data management and promote the use of common data practices and standards in the context of water science in Canada. The ten best practices proposed may be broadly applicable to other disciplines in the natural sciences and can be adopted and adapted globally.
