LH*s: a high-availability and high-security scalable distributed data structure

LH*s is a high-availability variant of LH*, a Scalable Distributed Data Structure. An LH*s record is striped into segments stored on different server nodes. A parity segment allows the record to be reconstructed if one segment becomes unavailable. The insert or key search time is about 1 ms on a 10 Mb/s network, and about 100 μs on a 1 Gb/s network, assuming the segments reside in distributed RAM. The file size is limited only by the distributed storage available, so a RAM file can reach dozens of GB in practice. Data security is enhanced, as every site holds only partial and typically meaningless data. The price to pay is 20-50% more storage than for an LH* file, and some additional messaging, especially for scan searches.
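The striping and parity idea described above can be illustrated with a minimal sketch. It is not the LH*s implementation: the byte-wise round-robin striping, the XOR parity, the zero padding, and the names `stripe` and `reconstruct` are all assumptions made for illustration, and the abstract does not specify any of them. Server placement and the LH* addressing are left out entirely.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe(record: bytes, k: int) -> list:
    """Split a record into k data segments plus one XOR parity segment.

    Assumption: bytes are dealt round-robin to the segments, and the
    record is zero-padded so that all segments have the same length.
    """
    seg_len = -(-len(record) // k)  # ceiling division
    padded = record.ljust(seg_len * k, b"\0")
    segments = [padded[i::k] for i in range(k)]
    parity = reduce(xor_bytes, segments)
    return segments + [parity]


def reconstruct(segments: list, length: int) -> bytes:
    """Rebuild the record from k+1 segments, at most one of which is None.

    `length` is the original record length, used to drop the padding.
    """
    missing = [i for i, s in enumerate(segments) if s is None]
    if len(missing) > 1:
        raise ValueError("a single parity segment tolerates only one loss")
    segments = list(segments)
    if missing:
        # The XOR of all surviving segments equals the lost one.
        present = [s for s in segments if s is not None]
        segments[missing[0]] = reduce(xor_bytes, present)
    data = segments[:-1]  # drop the parity segment
    k = len(data)
    out = bytearray(len(data[0]) * k)
    for i, seg in enumerate(data):
        out[i::k] = seg  # undo the round-robin striping
    return bytes(out[:length])


if __name__ == "__main__":
    record = b"key=42;name=Litwin"
    pieces = stripe(record, 4)  # in a real file, each piece would go to a different server
    pieces[2] = None  # simulate one unavailable segment
    assert reconstruct(pieces, len(record)) == record
```

Under these assumptions, one parity segment per k data segments adds 1/k to the storage, so k between 2 and 5 would be consistent with the 20-50% overhead quoted above. Each individual segment holds only every k-th byte of the record, which gives a rough picture of why a single site's content is typically meaningless on its own.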
