Citation and Peer Review of Data: Moving Towards Formal Data Publication

This paper discusses many of the issues associated with formally publishing data in academia, focusing primarily on the structures that need to be put in place for peer review and formal citation of datasets. Data publication is becoming increasingly important to the scientific community, as it will provide a mechanism for those who create data to receive academic credit for their work and will allow the conclusions arising from an analysis to be more readily verifiable, thus promoting transparency in the scientific process. Peer review of data will also provide a mechanism for ensuring the quality of datasets, and we provide suggestions on the types of activities one expects to see in the peer review of data. A simple taxonomy of data publication methodologies is presented and evaluated, and the paper concludes with a discussion of dataset granularity, transience and semantics, along with a recommended human-readable citation syntax.
