Astrolabe: Curating, Linking, and Computing Astronomy’s Dark Data

Where appropriate repositories are not available to support all relevant astronomical data products, data can fall into darkness: unseen and unavailable for future reference and re-use. Some data in this category are legacy data, but newer datasets are also often uncurated and could remain "dark". This paper describes the design motivation and development of Astrolabe, a cyberinfrastructure project that addresses a set of community recommendations for locating and ensuring the long-term curation of dark or otherwise at-risk data, together with integrated computing. It also describes the outcomes of the series of community workshops that informed the creation of Astrolabe. According to participants in these workshops, a large body of astronomical dark data exists that is not curated elsewhere, along with software that only a few individuals can execute and that therefore becomes unusable as computing platforms change. Astronomical research questions and challenges would be better addressed with integrated data and computational resources that fall outside the scope of existing observatory and space mission projects. As a solution, the Astrolabe system is designed to develop new resources for the management of astronomical data. The project is built on CyVerse cyberinfrastructure technology and is a collaboration between the University of Arizona and the American Astronomical Society. Overall, the project aims to support open access to research data by leveraging existing cyberinfrastructure resources, and to promote scientific discovery by making potentially useful data broadly available to the astronomical community in a computable format.
