Algebraic signatures for scalable distributed data structures

Signatures detect changes to data objects. Numerous schemes are in use, notably the cryptographically secure standard SHA-1. We propose a novel signature scheme, which we call algebraic signatures. The scheme uses Galois field calculations. Its major property is the certain detection of any change up to a parameterized size. More precisely, an n-symbol algebraic signature is guaranteed to detect any change that does not exceed n symbols. No other known signature scheme has this property. For larger changes, the collision probability is typically negligible, as for the other known schemes. We apply algebraic signatures to scalable distributed data structures (SDDS). At the SDDS client node, we filter out the updates that do not actually change the records. We also manage concurrent updates to data stored in the SDDS RAM buckets at the server nodes. We further use the scheme for fast disk backup of these buckets. We sign our objects with 4-byte signatures instead of the standard 20-byte SHA-1 signatures. Computing the algebraic signature is then also about twice as fast.
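As an illustration of the idea, the following is a minimal sketch of an n-symbol algebraic signature, not the authors' implementation. It assumes 8-bit symbols in GF(2^8) built from the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 with primitive element α = 2, and it takes component j of the signature to be the sum over i of p_i · α^(i·j) for j = 1..n. The field size, the polynomial, and the names `algebraic_signature` and `is_pseudo_update` are choices made for this example only. Under these assumptions, the guarantee of detecting any change of up to n symbols holds for pages shorter than 255 symbols.

```python
# Sketch only: the field, polynomial, and function names are assumptions
# made for illustration; they are not taken from the paper.
PRIM_POLY = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1, primitive over GF(2)
ORDER = 255         # multiplicative order of alpha = 2 in GF(2^8)


def build_tables():
    """Build antilog (EXP) and log (LOG) tables for GF(2^8), alpha = 2."""
    exp = [0] * ORDER
    log = [0] * 256
    x = 1
    for i in range(ORDER):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:          # reduce modulo the primitive polynomial
            x ^= PRIM_POLY
    return exp, log


EXP, LOG = build_tables()


def algebraic_signature(page: bytes, n: int = 4) -> bytes:
    """Return the n-symbol signature (sig_1, ..., sig_n) of a page.

    sig_j = XOR over i of page[i] * alpha^(i*j), computed in GF(2^8).
    With n = 4 and 8-bit symbols the result is 4 bytes long.
    """
    sig = []
    for j in range(1, n + 1):
        acc = 0
        for i, symbol in enumerate(page):
            if symbol:  # zero symbols contribute nothing to the sum
                acc ^= EXP[(LOG[symbol] + i * j) % ORDER]
        sig.append(acc)
    return bytes(sig)


def is_pseudo_update(before: bytes, after: bytes) -> bool:
    """Hypothetical client-side filter: True if the signatures match,
    i.e. the update most likely leaves the record unchanged."""
    return algebraic_signature(before) == algebraic_signature(after)


if __name__ == "__main__":
    record = bytes(range(200))
    changed = bytearray(record)
    changed[17] ^= 0x5A        # alter a single symbol
    print(is_pseudo_update(record, record))          # unchanged record
    print(is_pseudo_update(record, bytes(changed)))  # altered record
```

In this sketch a client would call `is_pseudo_update` on the before-image and after-image of a record and skip sending the update when it returns True. Equal signatures prove equality only when the two images differ in at most n symbols and are short enough; beyond that, a match is correct with high probability rather than with certainty.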
