Incremental Schema Mapping

Schema mapping, which provides users with a unified view of the data, is essential for managing schema heterogeneity among different sources. It can be carried out with either a machine learning approach or a knowledge engineering approach. The machine learning approach needs a training data set to build models, but training data are usually very difficult to obtain for large datasets, and it is also very difficult to modify a learned model using human knowledge. The knowledge engineering approach encodes human knowledge directly, so the knowledge base can be constructed with limited data, but it requires time-consuming knowledge acquisition. This research proposes an incremental schema mapping method that employs Ripple-Down Rules (RDR) with censored production rules (CPR). Our experimental results show that the RDR approach performs comparably to the machine learning approaches and that the RDR knowledge base can be expanded incrementally as the number of classified cases increases.
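To illustrate what incremental expansion of an RDR knowledge base can look like, the following is a minimal sketch of a generic single-classification RDR tree, not the method evaluated in this research. The feature names (`name_sim`, `same_type`), the thresholds, and the labels `match` / `no_match` are hypothetical, and censored production rules are not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Case = Dict[str, float]  # hypothetical features of an attribute pair


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    if_true: Optional["Rule"] = None   # exception branch, tried when this rule fires
    if_false: Optional["Rule"] = None  # alternative branch, tried when it does not


def classify(root: Rule, case: Case) -> Tuple[Optional[str], Rule, bool]:
    """Return the conclusion of the last rule that fired, plus the last
    rule visited and whether that rule fired (needed to attach new rules)."""
    node, conclusion = root, None
    while True:
        fired = node.condition(case)
        if fired:
            conclusion = node.conclusion
        nxt = node.if_true if fired else node.if_false
        if nxt is None:
            return conclusion, node, fired
        node = nxt


def add_rule(root: Rule, case: Case,
             condition: Callable[[Case], bool], conclusion: str) -> None:
    """Attach a correction rule at the end of the path the case followed."""
    _, last, fired = classify(root, case)
    new_rule = Rule(condition, conclusion)
    if fired:
        last.if_true = new_rule
    else:
        last.if_false = new_rule


# Default rule: always fires and concludes that the attributes do not match.
root = Rule(lambda c: True, "no_match")

# Case 1 is wrongly classified as no_match, so an expert adds a rule.
case1 = {"name_sim": 0.9, "same_type": 1.0}
add_rule(root, case1, lambda c: c["name_sim"] >= 0.8, "match")

# Case 2 is now wrongly classified as match, so an exception is added.
case2 = {"name_sim": 0.85, "same_type": 0.0}
add_rule(root, case2, lambda c: c["same_type"] == 0.0, "no_match")

print(classify(root, case1)[0])
print(classify(root, case2)[0])
```

Each correction is attached only at the point where the misclassified case ended up, so earlier rules stay untouched and the knowledge base grows one case at a time.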
