Integration of Data Sources through Data Mining

• Source databases can be organized by using several different models, such as the relational model, the object-oriented model, or semistructured models (e.g., XML).
• Information stored in a single table in one relational database can be stored in two or more tables in another. This problem is common when source databases show different levels of normalization and also occurs in nonrelational sources.
• A single field in one database, such as Name, could correspond to multiple fields, such as First Name and Last Name, in another.
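The last two mismatches can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the source: the field names (`customer_id`, `name`, `city`), the helper names `split_name` and `denormalize`, and the naive rule of splitting a name at its last space, which will mishandle many real names.

```python
from typing import Dict, List, Optional, Tuple


def split_name(full_name: str) -> Tuple[str, str]:
    """Map a single Name field to (first_name, last_name).

    Hypothetical heuristic: split at the last space. A one-word
    name yields an empty last name.
    """
    parts = full_name.strip().rsplit(" ", 1)
    if len(parts) == 1:
        return parts[0], ""
    return parts[0], parts[1]


def denormalize(
    customers: List[Dict[str, object]],
    addresses: List[Dict[str, object]],
) -> List[Dict[str, Optional[object]]]:
    """Combine two normalized tables into one flat table.

    Rows are matched on the assumed shared key 'customer_id';
    customers without an address get city=None.
    """
    address_by_id = {a["customer_id"]: a for a in addresses}
    flat = []
    for c in customers:
        addr = address_by_id.get(c["customer_id"], {})
        flat.append(
            {
                "customer_id": c["customer_id"],
                "name": c["name"],
                "city": addr.get("city"),
            }
        )
    return flat


if __name__ == "__main__":
    # Source A keeps one Name field; source B expects two fields.
    print(split_name("Ada Lovelace"))

    # Source A spreads customer data over two tables; source B uses one.
    customers = [{"customer_id": 1, "name": "Ada Lovelace"}]
    addresses = [{"customer_id": 1, "city": "London"}]
    print(denormalize(customers, addresses))
```

In practice such correspondences are not known in advance; the sketch only shows what a mapping looks like once it has been identified.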
