Measuring the Value of Research Data: A Citation Analysis of Oceanographic Data Sets

Evaluation of scientific research increasingly relies on publication-based bibliometric indicators, which may devalue other scientific activities, such as data curation, that do not necessarily result in scientific publications. This may undermine the movement to openly share and cite data sets in scientific publications, because researchers are unlikely to devote the effort needed to curate their research data if they do not expect to receive credit for doing so. This analysis attempts to demonstrate the bibliometric impact of properly curated and openly accessible data sets by estimating citation counts for three data sets archived at the National Oceanographic Data Center. My findings suggest that all three data sets are highly cited, with estimated citation counts in most cases higher than those of 99% of all journal articles published in Oceanography during the same years. I also find that methods of citing and referring to these data sets in scientific publications are highly inconsistent, even though a formal citation format is suggested for each data set. These findings have important implications for developing a data citation format, encouraging researchers to properly curate their research data, and evaluating the bibliometric impact of individuals and institutions.
