Bit Preservation: A Solved Problem?

For years, discussions of digital preservation have routinely featured comments such as “bit preservation is a solved problem; the real issues are …”. Indeed, current digital storage technologies are not just astoundingly cheap and capacious, they are astonishingly reliable. Unfortunately, these attributes drive a kind of “Parkinson’s Law” of storage, in which demands continually push beyond the capabilities of systems implementable at an affordable price. This paper is in four parts. Claims reviews a typical claim of storage system reliability and shows that it provides no useful information for bit preservation purposes. Theory proposes “bit half-life” as an initial, if inadequate, measure of bit preservation performance, expresses bit preservation requirements in terms of it, and shows that the requirements being placed on bit preservation systems are so onerous that the experiments required to prove that a solution exists are not feasible. Practice reviews recent research into how well actual storage systems preserve bits and shows that they fail to meet the requirements by many orders of magnitude. Policy suggests ways of dealing with this unfortunate situation.
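To give a feel for why such requirements become onerous, the following is a minimal sketch of how a "bit half-life" requirement could be computed. It rests on assumptions that are not stated in the abstract: each bit is taken to decay independently and exponentially, so that a bit survives time t with probability 2^(-t/T) for half-life T, and the petabyte-for-a-century scenario and the function name `required_bit_half_life` are purely illustrative.

```python
import math


def required_bit_half_life(n_bits, years, survival_probability=0.5):
    """Return the per-bit half-life (in years) needed so that ALL n_bits
    survive `years` with the given probability.

    Assumption (not from the abstract): bits fail independently, and each
    bit survives time t with probability 2 ** (-t / half_life).
    Then P(all survive) = 2 ** (-n_bits * years / half_life), so
    half_life = n_bits * years / log2(1 / survival_probability).
    """
    if not 0.0 < survival_probability < 1.0:
        raise ValueError("survival_probability must be strictly between 0 and 1")
    return n_bits * years / math.log2(1.0 / survival_probability)


# Illustrative scenario: one petabyte (8 * 10**15 bits) kept for 100 years
# with a 50% chance that no bit at all is damaged.
PETABYTE_BITS = 8 * 10**15
half_life = required_bit_half_life(PETABYTE_BITS, 100)
print(f"{half_life:.1e} years")  # prints "8.0e+17 years"
```

Under these assumptions the required half-life is far longer than any experiment could observe directly, which is the kind of infeasibility the Theory part refers to.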
