Balancing Open Science and Data Privacy in the Water Sciences

Open science practices such as publishing data and code are transforming water science by enabling synthesis and enhancing reproducibility. However, as research increasingly bridges the physical and social science domains (e.g., socio‐hydrology), well‐meaning researchers may unintentionally violate the privacy and security of individuals or communities by sharing sensitive information. Here we identify the contexts in which privacy violations are most likely to occur, such as work with high‐resolution spatial data (e.g., from remote sensing), consumer data (e.g., from smart meters), and/or digital trace data (e.g., from social media). We also suggest practices for identifying and addressing privacy concerns at the individual, institutional, and disciplinary levels. We strongly advocate that the water science community continue moving toward open science and socio‐environmental research, and that progress toward these goals be rooted in open and ethical data management.
