Framing the scope of the common data model for machine-actionable Data Management Plans

Research today requires processing data at a large scale. Data is no longer a collection of static documents, but often a continuous stream of information flowing into information systems. Researchers need to manage their data efficiently, not only to keep it safe but also to ensure that it can later be correctly interpreted and reused. Existing solutions are not sufficient. Traditional Data Management Plans are manually created text documents that describe how research data will be handled, yet researchers must still carry out all the described actions themselves. Machine-actionable Data Management Plans are a new approach that allows systems to act on behalf of researchers and other stakeholders involved in data management, helping them manage data in an efficient and scalable way. This paper summarises the work performed by the Research Data Alliance working group on Data Management Plan Common Standards to realise this vision. It describes the results of consultations and proof-of-concept tools that help in: identifying the information needs of stakeholders involved in data management; defining the scope of the common data model for machine-actionable Data Management Plans, to allow information to be exchanged between systems; and identifying the services and infrastructure components needed to support the automation of data management tasks.
