Delta storage for arbitrary non-text files

Storing different versions of documents is an essential part of every software version and configuration control tool. Delta storage is a commonly used technique for doing this without wasting an enormous amount of disk space. At present, most existing version control tools offer delta storage for text files only. Because not all files in a software project are plain text, delta storage should also be supported for arbitrary non-text files. This paper deals with a technique for generating deltas between arbitrary files without any assumptions about the file structure. After a brief discussion of commonly used delta techniques for line-structured text files, the algorithm for generating deltas between arbitrary non-text files is presented. Measurements of execution time and of the size of the resulting delta scripts conclude the paper.
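To make the idea of a structure-independent delta concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes a greedy scheme in which the new file is described by "copy" commands (byte ranges taken from the old file) and "add" commands (literal bytes). The names `make_delta`, `apply_delta`, and the seed length `BLOCK` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

BLOCK = 4  # hypothetical minimum match length (an assumption, not from the paper)


def make_delta(old: bytes, new: bytes, block: int = BLOCK):
    """Build a list of ("copy", offset, length) and ("add", data) commands."""
    # Index every block-length byte string of the old file by its offsets.
    index = defaultdict(list)
    for i in range(len(old) - block + 1):
        index[old[i:i + block]].append(i)

    delta = []
    literal = bytearray()  # bytes of `new` that have no match in `old`
    pos = 0
    while pos < len(new):
        best_off, best_len = -1, 0
        # Extend every candidate match as far as possible; keep the longest.
        for off in index.get(new[pos:pos + block], ()):
            n = 0
            while (off + n < len(old) and pos + n < len(new)
                   and old[off + n] == new[pos + n]):
                n += 1
            if n > best_len:
                best_off, best_len = off, n
        if best_len >= block:
            if literal:  # flush pending literal bytes first
                delta.append(("add", bytes(literal)))
                literal.clear()
            delta.append(("copy", best_off, best_len))
            pos += best_len
        else:
            literal.append(new[pos])
            pos += 1
    if literal:
        delta.append(("add", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta) -> bytes:
    """Reconstruct the new file from the old file and a delta."""
    out = bytearray()
    for cmd in delta:
        if cmd[0] == "copy":
            _, off, n = cmd
            out += old[off:off + n]
        else:
            out += cmd[1]
    return bytes(out)
```

Because the sketch compares raw bytes and never looks for line boundaries, it works the same way on text and binary input. A real tool would also need a compact on-disk encoding for the commands, which is left out here.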
