XCC: change control of XML documents

XML-based documents play a major role in modern information architectures and their corresponding workflows. In this context, the ability to identify and represent differences between two versions of a document is essential, as is the ability to merge document versions resulting from parallel editing processes.

Different approaches try to meet these challenges using operational transformation or document annotation. In both approaches, changes are tracked during editing, which requires corresponding editing applications. In software development, however, a state-based approach is common: versions are compared and merged using external tools called diff and patch. This allows users to edit documents without being tied to particular editing tools. Existing approaches can compare XML documents but lack a corresponding merge capability.

In this article, we present a comprehensive framework for comparing and merging XML documents using a state-based approach. Its design is based on an analysis of XML documents and their modification patterns. The framework is built on top of a context-oriented delta model. We present a diff algorithm that appears to be highly efficient in terms of speed and delta quality. The patch algorithm is able to merge document versions efficiently and reliably. The efficiency and reliability of our approach are verified using a competitive test scenario.
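
To illustrate what "state-based" means here, the following minimal sketch compares two document versions using only their final states, with no record of the edits made in between. It is not the diff algorithm or the delta model of this article: it assumes Python's standard library, flattens each tree into a sequence of nodes, and applies generic sequence matching. The function names `flatten` and `state_based_diff` are hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return the nodes in document order as (depth, tag, text) tuples.

    Simplification for illustration: attributes and tail text are ignored.
    """
    root = ET.fromstring(xml_text)
    nodes = []

    def walk(elem, depth):
        nodes.append((depth, elem.tag, (elem.text or "").strip()))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return nodes


def state_based_diff(old_xml, new_xml):
    """Compare two versions without any edit history.

    Returns the non-equal opcodes as (operation, old_nodes, new_nodes).
    """
    old_nodes, new_nodes = flatten(old_xml), flatten(new_xml)
    matcher = difflib.SequenceMatcher(a=old_nodes, b=new_nodes, autojunk=False)
    return [
        (op, old_nodes[i1:i2], new_nodes[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


OLD = "<doc><p>Hello</p><p>World</p></doc>"
NEW = "<doc><p>Hello</p><p>XML</p><p>World</p></doc>"

changes = state_based_diff(OLD, NEW)
for op, removed, added in changes:
    print(op, removed, added)
```

Because the comparison needs only the two serialized versions, the documents can be produced by any editor. A patch step would then apply such a list of changes to another version of the document.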
