Archiving Software Surrogates on the Web for Future Reference

Software has long been an essential part of the scientific process in mathematics and other disciplines. However, referencing software reliably in scientific publications remains challenging for several reasons. A crucial one is that software changes over time, and its successive versions or states are difficult to capture. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about half of the webpages of software are already archived, and almost all of them include some kind of documentation.
