Identifying Data Sharing and Reuse with Scholix: Potentials and Limitations

Summary: The Scholexplorer API, based on the Scholix (Scholarly Link eXchange) framework, aims to identify links between articles and their supporting data. This quantitative case study demonstrates that the API vastly expanded the number of datasets known to be affiliated with University of Bath outputs. By identifying peer-reviewed articles linked to at least one unique dataset, it allows improved monitoring of compliance with funder mandates. Availability of author names for research outputs increased from 2.4% to 89.2%, which enabled the identification, in the first phase, of ten articles that reuse non-Bath-affiliated datasets published in external repositories. Such links give data producers valuable evidence of data reuse and impact. Of these ten articles, only three formally cited the reused dataset in their reference lists. Further enhancement of the Scholix schema, and enrichment of Scholexplorer metadata using controlled vocabularies, would be beneficial. The adoption of standardized data citations by journals will be critical to creating links more systematically.
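As a rough illustration of "articles linked to at least one unique dataset", the sketch below groups article-to-dataset links by article and deduplicates the datasets. It is not the study's actual pipeline. The flat record shape (`source_id`, `source_type`, `target_id`, `target_type`), the function name, and the sample DOIs are all hypothetical; real Scholix records returned by Scholexplorer are nested and would need to be flattened first.

```python
from collections import defaultdict


def articles_with_datasets(links):
    """Map each article identifier to the set of distinct datasets linked to it.

    `links` is an iterable of dicts in a simplified, hypothetical shape
    (not the real Scholix JSON layout).
    """
    linked = defaultdict(set)
    for link in links:
        # Keep only publication -> dataset links.
        if link["source_type"] != "publication" or link["target_type"] != "dataset":
            continue
        # DOIs are case-insensitive, so lower-case them before deduplicating.
        linked[link["source_id"].lower()].add(link["target_id"].lower())
    return dict(linked)


# Made-up sample records, for illustration only.
sample_links = [
    {"source_id": "10.1000/article-a", "source_type": "publication",
     "target_id": "10.5555/dataset-1", "target_type": "dataset"},
    # Same link reported again with different letter case.
    {"source_id": "10.1000/ARTICLE-A", "source_type": "publication",
     "target_id": "10.5555/DATASET-1", "target_type": "dataset"},
    {"source_id": "10.1000/article-a", "source_type": "publication",
     "target_id": "10.5555/dataset-2", "target_type": "dataset"},
    {"source_id": "10.1000/article-b", "source_type": "publication",
     "target_id": "10.5555/dataset-1", "target_type": "dataset"},
    # Publication -> publication link, ignored by the grouping.
    {"source_id": "10.1000/article-b", "source_type": "publication",
     "target_id": "10.1000/article-a", "target_type": "publication"},
]

result = articles_with_datasets(sample_links)
compliant_articles = sorted(result)  # articles with at least one linked dataset
```

Under these assumptions, the keys of `result` are the articles that count toward compliance monitoring, and the size of each set is the number of unique datasets linked to that article.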
