Data Publications Correlate with Citation Impact

Neuroscience and molecular biology have generated large datasets in recent years, and these datasets are reshaping how research is conducted. In their wake, open data sharing has been singled out as a major challenge for the future of research. We conducted a comparative study of citations of data publications in both fields, showing that the average publication tagged with a data-related term by the NCBI MeSH (Medical Subject Headings) curators achieves a significantly larger citation impact than the average publication in either field. We introduce a new metric, the data article citation index (DAC-index), to identify the most prolific authors among those data-related publications. The study is fully reproducible from an executable Rmd (R Markdown) script together with all the citation datasets. We hope these results encourage authors to publish their data more openly.
