A Latent Semantic Indexing-Based Approach to Determine Similar Clusters in Large-scale Schema Matching

Schema matching plays a central role in identifying semantic correspondences in shared-data applications such as data integration. With the growing size and widespread use of XML schemas and ontologies, large-scale schema matching has become increasingly challenging. Clustering-based matching is an important step towards reducing the search space and thereby improving efficiency. However, existing methods for identifying similar clusters rely on literal term matching. To address this limitation, this paper proposes a new approach that uses Latent Semantic Indexing to capture the conceptual similarity between clusters. Experimental evaluations show encouraging results towards building efficient large-scale matching approaches.
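To make the contrast between literal term matching and Latent Semantic Indexing concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes that each cluster is represented as a bag of tokens taken from its element names, that raw term counts are used without weighting, and that the latent space is truncated to k = 2 dimensions. The cluster contents and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def term_cluster_matrix(clusters):
    """Build a term-by-cluster count matrix (rows: terms, columns: clusters)."""
    vocab = sorted({term for cluster in clusters for term in cluster})
    row = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(clusters)))
    for j, cluster in enumerate(clusters):
        for term in cluster:
            matrix[row[term], j] += 1.0
    return matrix, vocab

def lsi_cluster_vectors(matrix, k):
    """Project clusters into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(k, len(s))
    # Row j holds the latent coordinates of cluster j (V_k scaled by S_k).
    return (np.diag(s[:k]) @ vt[:k]).T

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def jaccard(a, b):
    """Literal term overlap between two clusters, for comparison."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical clusters from two schemas (illustrative tokens only).
clusters = [
    ["author", "title"],                    # 0: schema A
    ["writer", "book"],                     # 1: schema B, no term shared with 0
    ["author", "writer", "book", "title"],  # 2: links the two vocabularies
    ["price", "currency"],                  # 3: unrelated topic
    ["price", "amount", "currency"],        # 4: unrelated topic
]

matrix, vocab = term_cluster_matrix(clusters)
vectors = lsi_cluster_vectors(matrix, k=2)

print("literal overlap 0 vs 1:", jaccard(clusters[0], clusters[1]))
print("LSI similarity  0 vs 1:", cosine(vectors[0], vectors[1]))
print("LSI similarity  0 vs 3:", cosine(vectors[0], vectors[3]))
```

In this toy setting, clusters 0 and 1 share no terms, so a literal comparison treats them as unrelated, while their latent vectors end up close because both vocabularies co-occur in cluster 2. A realistic pipeline would likely need name tokenization, term weighting, and a tuned value of k, none of which are specified here.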
