NORMS: An automatic tool to perform schema label normalization

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and structure. Schema matching systems usually exploit lexical and semantic information provided by lexical databases or thesauri to discover intra- and inter-schema semantic relationships among schema elements. However, most of them perform poorly in real-world scenarios because of the significant presence of “non-dictionary words”, which include compound nouns, abbreviations, and acronyms. In this paper, we present NORMS (NORMalizer of Schemata), a tool that performs schema label normalization to increase the number of comparable labels extracted from schemata.
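To make the idea of label normalization concrete, the following is a minimal sketch and not the NORMS implementation. It assumes a tiny hand-made word list and abbreviation table (`DICTIONARY` and `ABBREVIATIONS`, both hypothetical) in place of a real lexical database, and it only covers tokenization and abbreviation expansion; the function names are likewise illustrative.

```python
import re

# Hypothetical resources for illustration only. A real system would consult a
# lexical database and a much larger abbreviation source.
DICTIONARY = {"customer", "order", "number", "quantity", "address", "delivery"}
ABBREVIATIONS = {"cust": "customer", "qty": "quantity", "num": "number", "addr": "address"}


def tokenize(label):
    """Split a schema label on separators and camelCase boundaries, lowercased."""
    tokens = []
    for part in re.split(r"[_\-\s]+", label):
        # Uppercase runs (acronyms), capitalized or lowercase words, digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def normalize(label):
    """Expand known abbreviations token by token; unknown tokens pass through."""
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokenize(label))


def is_comparable(label):
    """True if every normalized token is a dictionary word in this toy setup."""
    words = normalize(label).split()
    return bool(words) and all(w in DICTIONARY for w in words)


if __name__ == "__main__":
    for raw in ["CustOrderNum", "delivery_addr", "QTY"]:
        print(raw, "->", normalize(raw), "| comparable:", is_comparable(raw))
```

In this toy setup, a label such as `CustOrderNum` cannot be looked up as a single word, but after normalization each of its tokens can, which is the sense in which normalization increases the number of comparable labels.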

Sonia Bergamaschi | Serena Sorrentino | Maciej Gawinecki
