Uncertainty in Data Integration

Data integration has been an important area of research for several years. In this chapter, we argue that supporting modern data integration applications requires systems to handle uncertainty at every step of integration. We provide a formal framework for data integration systems with uncertainty. We define probabilistic schema mappings and probabilistic mediated schemas, show how they can be constructed automatically for a set of data sources, and provide techniques for query answering. The foundations laid out in this chapter enable bootstrapping a pay-as-you-go integration system completely automatically.
