FAIRshake: toolkit to evaluate the findability, accessibility, interoperability, and reusability of research digital resources

As the research community produces more datasets, tools, workflows, APIs, and other digital resources, it is becoming increasingly difficult to harmonize and organize these efforts for maximal integrated utilization. The Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles have prompted many stakeholders to consider strategies for tackling this challenge by making digital resources follow common standards and best practices so that they become more integrated and organized. Faced with the question of how to make digital resources more FAIR, it has become imperative to measure what it means to be FAIR. Because resources, communities, and stakeholders are diverse, with different goals and use cases, assessing FAIRness is particularly challenging. To begin resolving this challenge, the FAIRshake toolkit was developed to enable the establishment of community-driven FAIR metrics and rubrics, paired with manual, semi-automated, and fully automated FAIR assessment capabilities. The FAIRshake toolkit contains a database that lists registered digital resources together with their associated metrics, rubrics, and assessments. FAIRshake also provides a browser extension and a bookmarklet that enable viewing and submitting assessments from any website. FAIR assessment results are visualized as an insignia that can be viewed on the FAIRshake website or embedded within hosting websites. Using FAIRshake, a variety of bioinformatics tools, datasets listed on dbGaP, APIs registered in SmartAPI, workflows in Dockstore, and other biomedical digital resources were assessed for FAIRness, both manually and automatically. In each case, the assessments revealed room for improvement, prompting enhancements that significantly improved the FAIRness scores of several digital resources.
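The rubric-and-metric model described above can be illustrated with a minimal sketch. This is not FAIRshake's actual implementation or API; the function name, metric labels, and the simple averaging scheme are all assumptions chosen only to show the idea of a rubric as a set of metric answers that aggregate into an overall FAIRness score:

```python
def assess(rubric):
    """Aggregate per-metric answers into an overall score.

    `rubric` maps a metric description to an answer in [0, 1],
    where 0 means the resource fails the metric and 1 means it
    fully satisfies it. Here we simply average the answers;
    a real rubric could weight metrics differently.
    """
    if not rubric:
        raise ValueError("rubric has no metrics")
    return sum(rubric.values()) / len(rubric)


# Hypothetical answers for one digital resource.
answers = {
    "Globally unique, persistent identifier": 1.0,
    "Machine-readable metadata available": 0.5,
    "Standard, open access protocol": 1.0,
    "Explicit usage license": 0.0,
}

score = assess(answers)
print(f"Overall FAIRness score: {score:.3f}")
```

Re-running the assessment after an enhancement (for example, adding a usage license and raising that answer to 1.0) raises the overall score, which mirrors the before-and-after improvement pattern reported in the abstract.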
