The hybrid integration system: Towards a new approach for creating candidate views for materialization

The popularization of information technologies and telecommunications has generated an enormous amount of information. This information is generally heterogeneous and stored in autonomous, distributed sources, which makes information integration systems necessary. These systems must ensure both a low query response time and the freshness of the data. A purely virtual approach cannot meet these requirements. On the one hand, the query response time is high, because the mediator must access the sources on every query to load the relevant information. On the other hand, the sources are not always available. A hybrid integration system, in which one portion of the information is materialized in the mediator while the other portion remains in the sources and is extracted at query time, is an effective solution to these problems, provided that the materialized part is carefully chosen. Based on the distribution of user queries, we present in this paper an approach that selects the information most requested by users and organizes it as candidate views for materialization in the mediator.
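The idea of selecting the most requested information from the distribution of user queries can be illustrated with a minimal sketch. It is not the paper's algorithm: the function `candidate_attributes`, the `min_share` threshold, the representation of a query as a set of attribute names, and the sample log are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def candidate_attributes(query_log: Iterable[frozenset], min_share: float) -> List[str]:
    """Return the attributes requested by at least `min_share` of the logged queries.

    Hypothetical helper: each query is modeled as the set of attribute
    names it requests, which is an assumption of this sketch.
    """
    queries = list(query_log)
    if not queries:
        return []
    # Count in how many queries each attribute appears.
    counts = Counter(attr for query in queries for attr in query)
    threshold = min_share * len(queries)
    # Frequently requested attributes are candidates for materialization.
    return sorted(attr for attr, n in counts.items() if n >= threshold)


# Illustrative query log (invented data, not taken from the paper).
log = [
    frozenset({"hotel.name", "hotel.city"}),
    frozenset({"hotel.name", "hotel.price"}),
    frozenset({"hotel.name", "hotel.city"}),
    frozenset({"flight.number"}),
]
hot = candidate_attributes(log, min_share=0.5)
print(hot)
```

In a hybrid system of the kind described, attributes above the threshold would be grouped into candidate views held in the mediator, while the remaining attributes would still be fetched from the sources at query time.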
