Data Provenance in Citizen Science Databases

A growing number of scientific groups are developing citizen science applications. Citizen science is a relatively new domain of science that has already proved to be as beneficial as classical science. One of the major challenges citizen science faces is data quality assurance. Projects use several techniques to verify data quality, such as expert evaluation and voting systems. Data provenance is used in many scientific systems and provides a reliable mechanism for tracking data history. It covers the origin of the data, the changes made to it, and all interactions between different parts of the data. Data provenance itself has many types, such as “Why provenance”, “When provenance”, and “What provenance”. The purpose of this work is to build a prototype of a database with built-in data provenance. Several database systems and models, including relational and NoSQL databases, are taken into consideration. Experiments are conducted to test the limitations of the proposed prototype.
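To illustrate what built-in provenance could look like in a relational setting, the following is a minimal sketch using Python's standard sqlite3 module. It is not the prototype described in this work: the table names (observation, provenance), the column names, and the sample contributors are all hypothetical, and triggers are just one possible way to record origin and change history automatically.

```python
import sqlite3


def open_db():
    """Create an in-memory database whose triggers log every insert and
    species change on the (hypothetical) observation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE observation (
            id        INTEGER PRIMARY KEY,
            species   TEXT NOT NULL,
            edited_by TEXT NOT NULL          -- who wrote the current value
        );
        CREATE TABLE provenance (
            seq            INTEGER PRIMARY KEY AUTOINCREMENT,
            observation_id INTEGER NOT NULL,
            action         TEXT NOT NULL,    -- 'insert' or 'update'
            old_species    TEXT,
            new_species    TEXT,
            agent          TEXT NOT NULL,
            recorded_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Origin: record who first contributed the observation.
        CREATE TRIGGER observation_insert AFTER INSERT ON observation
        BEGIN
            INSERT INTO provenance (observation_id, action, new_species, agent)
            VALUES (NEW.id, 'insert', NEW.species, NEW.edited_by);
        END;
        -- Changes: record the old and new value and who made the change.
        CREATE TRIGGER observation_update AFTER UPDATE OF species ON observation
        BEGIN
            INSERT INTO provenance
                (observation_id, action, old_species, new_species, agent)
            VALUES (NEW.id, 'update', OLD.species, NEW.species, NEW.edited_by);
        END;
    """)
    return conn


def history(conn, observation_id):
    """Return the provenance trail of one observation, oldest entry first."""
    rows = conn.execute(
        "SELECT action, old_species, new_species, agent FROM provenance "
        "WHERE observation_id = ? ORDER BY seq",
        (observation_id,),
    )
    return rows.fetchall()


conn = open_db()
# A volunteer submits an observation; an expert later corrects the species.
conn.execute(
    "INSERT INTO observation (id, species, edited_by) "
    "VALUES (1, 'Parus major', 'volunteer_17')"
)
conn.execute(
    "UPDATE observation SET species = 'Cyanistes caeruleus', "
    "edited_by = 'expert_3' WHERE id = 1"
)
trail = history(conn, 1)
```

Because the triggers live in the database, the trail is written regardless of which application issues the insert or update. In this sketch, trail holds one entry for the volunteer's original submission and one for the expert's correction.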
