LH*g: A High-Availability Scalable Distributed Data Structure by Record Grouping

LH*g (Linear Hashing by grouping) is a high-availability extension of the LH* scalable distributed data structure. An LH*g file scales up while keeping key search and insert performance constant, and it survives the unavailability (failure) of any single site. We achieve high availability through a new principle of record grouping. A group is a logical structure of up to k records, where k is a file parameter. Every group contains a parity record that allows an unavailable member to be reconstructed. The basic scheme may be generalized to tolerate the unavailability of any number of sites, at the expense of additional storage and messaging. Other known high-availability schemes are static, require more storage, or provide worse search performance.
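To make the grouping idea concrete, here is a minimal sketch of how one parity record can recover one missing member of a group. It assumes bitwise XOR parity over the record payloads, with shorter payloads padded by zero bytes; the abstract does not specify the parity computation. The names `xor_bytes`, `make_parity`, and `reconstruct` are hypothetical, and the sketch does not cover how LH*g assigns records to groups or where it stores the parity.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, padding the shorter one with zero bytes."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\x00")
    b = b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(records: list[bytes]) -> bytes:
    """Build the parity payload for a group (assumption: plain XOR parity)."""
    return reduce(xor_bytes, records, b"")


def reconstruct(survivors: list[bytes], parity: bytes, length: int) -> bytes:
    """Recover the single missing member from the survivors and the parity.

    `length` is the original payload length of the missing record; this
    sketch assumes it is kept alongside the parity record.
    """
    return reduce(xor_bytes, survivors, parity)[:length]


# A group of k = 3 records, each imagined to live on a different site.
group = [b"alpha", b"be", b"gamma!"]
parity = make_parity(group)

# The site holding group[1] becomes unavailable; rebuild its record.
recovered = reconstruct([group[0], group[2]], parity, len(group[1]))
```

A single XOR parity of this kind tolerates only one unavailable member per group, which matches the single-site guarantee of the basic scheme. Tolerating more failures would need additional redundancy, which is where the extra storage and messaging cost arises.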
