Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals

Journal policy on research data and code availability is an important part of the ongoing shift toward publishing reproducible computational science. This article extends the literature by studying journal data sharing policies in both 2011 and 2012 for a referent set of 170 journals. We make a further contribution by evaluating code sharing policies, supplemental materials policies, and open access status for the same 170 journals in each of the two years. We build a predictive model of open data and code policy adoption as a function of impact factor and publisher, and find that higher impact journals are more likely to have open data and code policies, and that scientific societies are more likely to have them than commercial publishers. We also find that open data policies tend to lead open code policies, and we find no relationship between open data and code policies and either supplemental materials policies or open access journal status. Of the journals in this study, 38% had a data policy, 22% had a code policy, and 66% had a supplemental materials policy as of June 2012. This reflects a striking one-year increase of 16% in the number of data policies, 30% in the number of code policies, and 7% in the number of supplemental materials policies. We introduce a new dataset to the community that categorizes data and code sharing, supplemental materials, and open access policies in 2011 and 2012 for these 170 journals.
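The relationship between the 2012 shares and the reported one-year increases can be illustrated with a short back-of-envelope calculation. This is only a sketch under two assumptions that the abstract does not confirm: that the increases are relative changes in the number of journals with a policy (not percentage-point changes), and that the reported shares are rounded values. The helper name `implied_prior_count` is hypothetical, and the 2011 figures it produces are rough estimates, not numbers reported in the study.

```python
def implied_prior_count(current_share, n_journals, relative_increase):
    """Back out the earlier-year count implied by a relative increase.

    Assumes current_count = prior_count * (1 + relative_increase).
    Returns (current_count, estimated_prior_count).
    """
    current_count = round(current_share * n_journals)
    prior_count = current_count / (1.0 + relative_increase)
    return current_count, prior_count


N_JOURNALS = 170  # size of the journal set stated in the abstract

# (share of journals with the policy in June 2012, reported one-year increase)
reported = {
    "data": (0.38, 0.16),
    "code": (0.22, 0.30),
    "supplemental materials": (0.66, 0.07),
}

for policy, (share_2012, rel_inc) in reported.items():
    count_2012, est_2011 = implied_prior_count(share_2012, N_JOURNALS, rel_inc)
    # The 2011 value is an estimate derived from rounded inputs.
    print(f"{policy}: about {count_2012} journals in 2012, "
          f"roughly {est_2011:.0f} in 2011 (estimate)")
```

Because the inputs are rounded percentages, the implied 2011 counts should be read as approximate; the dataset introduced by the article is the authoritative source for the actual counts.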
