Crossref: The sustainable source of community-owned scholarly metadata

This paper describes the scholarly metadata collected and made available by Crossref, as well as its importance in the scholarly research ecosystem. Containing over 106 million records and expanding at an average rate of 11% a year, Crossref’s metadata has become one of the major sources of scholarly data for publishers, authors, librarians, funders, and researchers. The metadata set consists of 13 content types, including not only traditional types, such as journals and conference papers, but also data sets, reports, preprints, peer reviews, and grants. The metadata is not limited to basic publication metadata; it can also include abstracts, links to full text, funding and license information, citation links, and information about corrections, updates, and retractions. This scale and breadth make Crossref a valuable source for research in scientometrics, including measuring the growth and impact of science and understanding new trends in scholarly communications. The metadata is available through a number of APIs, including a REST API and OAI-PMH. In this paper, we describe the kind of metadata that Crossref provides and how it is collected and curated. We also look at Crossref’s role in the research ecosystem and at trends in metadata curation over the years, including the evolution of its citation data provision. We summarize research that has used Crossref’s metadata and describe plans to improve metadata quality and retrieval in the future.
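As a rough illustration of the REST API access mentioned above, the following Python sketch builds a query URL for the public `https://api.crossref.org/works` endpoint and extracts a few fields from a response. The helper names `build_works_url` and `summarize_items` are hypothetical, and the sample response is a hand-written placeholder that only mimics the general shape of a reply (a `message` object holding an `items` list); it is not real Crossref data, and the paper itself does not prescribe this usage.

```python
import json
from urllib.parse import urlencode

# Public Crossref REST API endpoint for works (records of all content types).
API_BASE = "https://api.crossref.org/works"


def build_works_url(query, rows=5, content_type=None):
    """Build a search URL; content_type is an optional filter such as 'journal-article'."""
    params = {"query": query, "rows": rows}
    if content_type is not None:
        params["filter"] = f"type:{content_type}"
    return f"{API_BASE}?{urlencode(params)}"


def summarize_items(response_text):
    """Return (DOI, type, first title) tuples from a works response body."""
    message = json.loads(response_text)["message"]
    summaries = []
    for item in message.get("items", []):
        titles = item.get("title") or [""]  # titles are delivered as a list
        summaries.append((item.get("DOI"), item.get("type"), titles[0]))
    return summaries


# Hand-written placeholder response (assumption: illustrative values only).
SAMPLE_RESPONSE = json.dumps({
    "status": "ok",
    "message": {
        "items": [
            {"DOI": "10.5555/12345678", "type": "journal-article",
             "title": ["A placeholder article title"]},
            {"DOI": "10.5555/87654321", "type": "posted-content"},
        ]
    },
})

if __name__ == "__main__":
    print(build_works_url("open access", rows=2, content_type="journal-article"))
    for doi, kind, title in summarize_items(SAMPLE_RESPONSE):
        print(doi, kind, title)
```

To run this against the live service, the URL returned by `build_works_url` could be fetched with any HTTP client and the response body passed to `summarize_items`; records that lack a title are reported with an empty string.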
