Integration of XML Data

Various XML instances from different data sources can model the same real-world object. Processing queries or defining views over these sources therefore requires instance integration. In this context, integration means identifying which data instances represent the same real-world object and resolving ambiguities in how that object is represented. The entity identification problem is more complex in XML than in structured databases, because XML data, as originally conceived, do not necessarily carry an identifying notion such as a primary key or object identifier. A mechanism is therefore needed to identify instances at the moment of data integration. This paper presents a proposal for assigning identifiers to XML instances, based on Skolem functions and the XPath recommendation of the W3C.
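
The general idea of combining Skolem functions with XPath can be illustrated with a minimal sketch. It is not the paper's actual mechanism: the element names, the sample documents, and the helpers `skolem` and `assign_ids` are hypothetical. The sketch assumes that each source exposes some key-like value reachable by a path expression, and it relies on Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath. A Skolem function is modeled here as a deterministic term constructor: equal arguments always yield the same identifier, so instances from different sources with equal key values receive the same identifier.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sources that describe books with different structures.
SOURCE_A = "<library><book><isbn>111</isbn><title>XML Basics</title></book></library>"
SOURCE_B = (
    "<catalog>"
    "<item><code>111</code><name>XML Basics, 2nd printing</name></item>"
    "<item><code>222</code><name>Data Integration</name></item>"
    "</catalog>"
)


def skolem(name, *args):
    """Build a deterministic identifier term: same arguments, same identifier."""
    return f"{name}({', '.join(args)})"


def assign_ids(xml_text, instance_path, key_paths, function_name="book_id"):
    """Map each Skolem identifier to the instances that produced it.

    instance_path selects the instances; key_paths are evaluated relative
    to each instance and supply the arguments of the Skolem function.
    """
    root = ET.fromstring(xml_text)
    ids = {}
    for node in root.findall(instance_path):
        key_values = [node.findtext(p, default="").strip() for p in key_paths]
        ids.setdefault(skolem(function_name, *key_values), []).append(node)
    return ids


ids_a = assign_ids(SOURCE_A, "./book", ["isbn"])
ids_b = assign_ids(SOURCE_B, "./item", ["code"])

# Identifiers present in both sources mark instances of the same object.
shared = sorted(set(ids_a) & set(ids_b))
print(shared)
```

In this sketch the correspondence between `isbn` in one source and `code` in the other is supplied by hand; deciding which paths play the role of a key in each source is the part that an integration approach has to specify.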
