Change detection in hierarchically structured information

Detecting and representing changes to data is important for active databases, data warehousing, view maintenance, and version and configuration management. Most previous work in change management has dealt with flat-file and relational data; we focus on hierarchically structured data. Since in many cases changes must be computed from old and new versions of the data, we define the hierarchical change detection problem as the problem of finding a "minimum-cost edit script" that transforms one data tree into another, and we present efficient algorithms for computing such an edit script. Our algorithms make use of some key domain characteristics to achieve substantially better performance than previous, general-purpose algorithms. We study the performance of our algorithms both analytically and empirically, and we describe the application of our techniques to hierarchically structured documents.
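To make the notion of an edit script concrete, the following minimal sketch applies a script to a small document tree and reports its cost. It is an illustration only, not the algorithm of the paper: it shows what a script is, not how a minimum-cost one is found. The `Node` class, the tuple encoding of operations, the four operation kinds (insert, delete, update, move), and the unit cost per operation are all assumptions made for this example.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, so list.remove() is unambiguous
class Node:
    id: str                      # hypothetical unique identifier
    label: str                   # e.g. "doc", "para", "sent"
    value: str = ""
    children: list[Node] = field(default_factory=list)


def find(root, node_id, parent=None):
    """Return (node, parent) for node_id, or (None, None) if absent."""
    if root.id == node_id:
        return root, parent
    for child in root.children:
        hit, par = find(child, node_id, root)
        if hit is not None:
            return hit, par
    return None, None


def apply_script(root, script):
    """Apply the operations in place; return total cost (assumed: 1 per operation)."""
    cost = 0
    for op in script:
        kind = op[0]
        if kind == "insert":    # ("insert", new_node, parent_id, position)
            parent, _ = find(root, op[2])
            parent.children.insert(op[3], op[1])
        elif kind == "delete":  # ("delete", node_id), leaves only in this sketch
            node, parent = find(root, op[1])
            assert not node.children, "only leaf deletion is sketched here"
            parent.children.remove(node)
        elif kind == "update":  # ("update", node_id, new_value)
            node, _ = find(root, op[1])
            node.value = op[2]
        elif kind == "move":    # ("move", node_id, new_parent_id, position)
            node, old_parent = find(root, op[1])
            old_parent.children.remove(node)
            new_parent, _ = find(root, op[2])
            new_parent.children.insert(op[3], node)
        else:
            raise ValueError(f"unknown operation: {kind}")
        cost += 1
    return cost


def to_tuple(node):
    """Structural snapshot (label, value, children) for comparing trees."""
    return (node.label, node.value, [to_tuple(c) for c in node.children])


# Old version: a document with two paragraphs.
old = Node("d", "doc", children=[
    Node("p1", "para", children=[Node("s1", "sent", "a"), Node("s2", "sent", "b")]),
    Node("p2", "para", children=[Node("s3", "sent", "c")]),
])

# A script that moves sentence s2 to the end of p2 and then revises it.
script = [("move", "s2", "p2", 1), ("update", "s2", "b, revised")]
cost = apply_script(old, script)
```

In this encoding, the change detection problem is the inverse task: given only the old and new trees, find a script like `script` whose total cost is as small as possible.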
