How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers

We analyze data sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links included in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers' personal websites become unreachable much faster than links to datasets on curated institutional sites. To further gauge astronomers' current data sharing practices and preferences, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics, in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data sharing among astronomers at this institution. Key reasons that more data are not presently shared more efficiently in astronomy include: the difficulty of sharing large datasets; overreliance on non-robust, non-reproducible mechanisms for sharing data (e.g., emailing it); unfamiliarity with options that make data sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.
