A State-of-the-Art Review of Knowledge Discovery in Multiple Databases

Abstract: Knowledge discovery in multiple databases offers many opportunities and challenges. We present a number of motivations for knowledge discovery in multiple databases. To guide further study, we highlight several domains that have given rise to numerous problems involving multiple related databases. We also discuss the data preprocessing activities required in a multi-database mining environment, outline the important techniques for mining multiple databases, and highlight many interesting patterns that have emerged from multi-database environments. We expect more research outcomes and investigations as the number of multi-database domains continues to rise.
