Research Data Preservation Using Process Engines and Machine-Actionable Data Management Plans

Scientific experiments in many domains now require collecting, processing, and reusing data. Researchers must comply with funder policies that prescribe how data should be managed, shared, and preserved. In most cases this has to be documented in data management plans. When data is selected and moved into a repository at the end of a project, researchers often find it hard to identify which files need to be preserved and where they are located. We therefore need a mechanism that lets researchers integrate preservation functionality into their daily data management workflows, to avoid situations in which scientific data is not properly preserved.
