Schema matching is a critical problem for integrating heterogeneous information sources. Traditionally, matching multiple schemas has relied on finding pairwise attribute correspondences. In contrast, we propose a new matching paradigm, holistic schema matching, which matches many schemas at the same time and finds all the matchings at once. By handling a set of schemas together, we can exploit context information that reflects the semantic correspondences among attributes and that is not available when schemas are matched only in pairs. We recently developed two alternative approaches as realizations of the holistic paradigm. This article takes an initial step toward unifying those two approaches and contrasts their strengths and weaknesses. Specifically, we develop two alternative methods for realizing holistic schema matching: global evaluation and local evaluation. Global evaluation exhaustively assesses all possible models, where a model expresses all attribute matchings. In particular, we propose the MGS framework for such global evaluation, based on the hypothesis that generative models exist. Local evaluation, on the other hand, independently assesses every single matching to incrementally construct the model. In particular, we develop the DCM framework for such local evaluation, based on the observation that co-occurrence patterns across schemas often reveal the complex relationships of attributes. We apply our approaches to matching Web query interfaces on the deep Web. The results show the effectiveness of both the MGS and DCM approaches, which together demonstrate the promise of the holistic paradigm for schema matching.
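As a rough illustration of how co-occurrence patterns across schemas can hint at attribute relationships, the following minimal sketch counts how often attributes appear together over a set of schemas and flags pairs that are each frequent but never appear in the same schema. It is not the DCM framework or its correlation measure: the function names, the `min_freq` threshold, the "never co-occur" rule, and the toy schemas are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(schemas):
    """Count attribute frequencies and pairwise co-occurrences over all schemas."""
    attr_freq = Counter()
    pair_freq = Counter()
    for schema in schemas:
        attrs = sorted(set(schema))  # sorted so each pair has one canonical order
        attr_freq.update(attrs)
        pair_freq.update(combinations(attrs, 2))
    return attr_freq, pair_freq


def rarely_cooccurring_pairs(schemas, min_freq=2):
    """Return pairs of frequent attributes that never share a schema.

    Hypothetical heuristic for illustration only: such pairs are treated
    as candidates for attributes that express the same concept.
    """
    attr_freq, pair_freq = cooccurrence_counts(schemas)
    frequent = sorted(a for a, n in attr_freq.items() if n >= min_freq)
    return [(a, b) for a, b in combinations(frequent, 2) if pair_freq[(a, b)] == 0]


# Toy query-interface schemas (invented data, not from the article).
schemas = [
    ["author", "title", "isbn"],
    ["writer", "title", "isbn"],
    ["author", "title"],
    ["writer", "title", "subject"],
]
candidates = rarely_cooccurring_pairs(schemas)
print(candidates)
```

On this toy input, `author` and `writer` each occur in two schemas but never together, so they are the only pair flagged. Because every attribute pair is assessed from counts gathered over all schemas at once, the sketch uses context that a purely pairwise comparison of two schemas would not have.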