A Survey on Web Archiving Initiatives

Web archiving has been gaining interest and is increasingly recognized as important for modern societies around the world. However, web archivists often find it difficult to demonstrate this, for instance to funders. This study provides an updated, global overview of web archiving. The results show that the number of web archiving initiatives grew significantly after 2003 and that these initiatives are concentrated in developed countries. We statistically analyzed metrics such as the volume of archived data, the archive file formats used, and the number of people engaged. Taken together, web archives must process more data than any web search engine. Considering the complexity and the large amounts of data involved in web archiving, the results show that the resources assigned to it are scarce. A Wikipedia page was created to complement this work and to be kept up to date collaboratively by the community.
