Preserving the Scholarly Side of the Web

This paper presents the results of a case study that addresses many of the issues surrounding the difficult task of preservation in a digital library. We focus on a subset of these issues as they apply to the preservation of scholarly articles encoded in Web standards. We also describe the two common preservation mechanisms, emulation and migration, and explain our selection of the latter for our particular case. Finally, we compare two approaches to migration, automatic and manual, and discuss their strengths and weaknesses in our context. We show that consistent use of open standards leads to more efficient migration processes, and we issue a "call to arms" to the digital preservation community to ensure that scholarly material on the Web can be preserved for future generations.
