Why data citation isn't working, and what to do about it

Abstract: We describe a system that automatically generates, from a curated database, a collection of short conventional publications (citation summaries) that describe the contents of various components of the database. The purpose of these summaries is to ensure that contributors to the database receive appropriate credit through currently used measures such as the h-index. The summaries also give credit to the publications and people cited by the database. Doing this requires dealing with granularity (how many summaries should be generated to represent the contributions to a database effectively?) and with evolution (for how long can a given summary serve as an appropriate reference while the database evolves?). We describe a journal specifically tailored to contain these citation summaries. We also briefly discuss the limitations that current mechanisms for recording citations place on both the process and the value of data citation.
