Efficient Verification of B-tree Integrity

The integrity of B-tree structures can become compromised for many reasons. Since these inconsistencies manifest themselves in unpredictable ways, all commercial database management systems include mechanisms to verify the integrity and trustworthiness of an individual index and of a set of related indexes, and all vendors recommend index verification as part of regular database maintenance. This paper introduces algorithms for B-tree validation, reviews the algorithms’ strengths and weaknesses, and proposes a simple yet effective improvement for key verification across multiple B-tree levels. The performance is such that B-tree verification can become part of scans or backups. Our experimental comparisons include algorithm performance and scalability measured using a shipping product.
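To make concrete what "key verification across multiple B-tree levels" involves, the following is a minimal in-memory sketch of a naive top-down check. It is not the algorithm proposed in the paper, and it is not taken from any product. The `Node` class, the function names, and the convention that child i holds keys k with keys[i-1] <= k < keys[i] are all assumptions made for illustration. The sketch checks three invariants: keys are strictly ascending within each node, every key lies within the range implied by the separator keys of its ancestors, and all leaves sit at the same depth.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical in-memory B-tree node; leaves have no children."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)


def check_subtree(node: Node, low: Optional[int], high: Optional[int],
                  depth: int, leaf_depths: Set[int], errors: List[str]) -> None:
    # Invariant 1: keys within a node are strictly ascending.
    for a, b in zip(node.keys, node.keys[1:]):
        if a >= b:
            errors.append(f"depth {depth}: keys out of order ({a} before {b})")

    # Invariant 2: every key lies in [low, high), the range inherited
    # from the separator keys of the ancestors (None means unbounded).
    for k in node.keys:
        if (low is not None and k < low) or (high is not None and k >= high):
            errors.append(f"depth {depth}: key {k} outside [{low}, {high})")

    if not node.children:
        leaf_depths.add(depth)
        return

    if len(node.children) != len(node.keys) + 1:
        errors.append(f"depth {depth}: {len(node.keys)} keys "
                      f"but {len(node.children)} children")
        return

    # Child i must stay between the separators on either side of it.
    bounds = [low, *node.keys, high]
    for i, child in enumerate(node.children):
        check_subtree(child, bounds[i], bounds[i + 1],
                      depth + 1, leaf_depths, errors)


def verify_btree(root: Node) -> List[str]:
    """Return a list of human-readable problems; empty means consistent."""
    errors: List[str] = []
    leaf_depths: Set[int] = set()
    check_subtree(root, None, None, 0, leaf_depths, errors)
    # Invariant 3: all leaves are at the same depth.
    if len(leaf_depths) > 1:
        errors.append(f"leaves at different depths: {sorted(leaf_depths)}")
    return errors


good = Node([10, 20], [Node([1, 5]), Node([10, 15]), Node([20, 30])])
bad = Node([10, 20], [Node([1, 5]), Node([10, 25]), Node([20, 30])])
print(verify_btree(good))
print(verify_btree(bad))
```

A recursive traversal like this follows parent-to-child pointers. On a disk-resident index that would typically mean reading pages in an order unrelated to their physical layout, which is the kind of cost an efficient verification algorithm has to avoid.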
