Modified algorithms for synthesizing high-frequency rules from different data sources

Because of the rapid growth in information and communication technologies, a company’s data may be spread over several continents. For effective decision making, knowledge workers need data that may be stored at geographically separate locations. In such circumstances, multi-database mining plays a major role in extracting knowledge from different data sources. In this paper, we propose a new methodology for synthesizing high-frequency rules from different data sources, in which the weight of each data source is calculated on the basis of its transaction population. We also propose a new method for calculating global confidence. Our goal in synthesizing local patterns into global patterns is that the support and confidence of the synthesized patterns should be very nearly the same as those that would be obtained if all the databases were integrated and mono-mining were performed on the result. Our experiments show that the proposed method of synthesizing high-frequency rules meets this requirement fairly well.
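The idea of weighting each data source by its transaction population can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the input layout, and in particular the confidence formula (synthesized rule support divided by synthesized antecedent support) are assumptions made for illustration, since the abstract does not state the exact definitions. The sketch also assumes that every source reports the rule; handling rules that fall below the local threshold in some sources is left out.

```python
from typing import List, Tuple


def source_weights(transaction_counts: List[int]) -> List[float]:
    """Weight each source by its share of all transactions (assumed scheme)."""
    total = sum(transaction_counts)
    if total <= 0:
        raise ValueError("at least one source must contain transactions")
    return [n / total for n in transaction_counts]


def synthesize_support(local_supports: List[float], weights: List[float]) -> float:
    """Weighted sum of the local supports reported by each source."""
    return sum(w * s for w, s in zip(weights, local_supports))


def synthesize_rule(
    transaction_counts: List[int],
    rule_supports: List[float],        # local supp(X u Y) per source
    antecedent_supports: List[float],  # local supp(X) per source
) -> Tuple[float, float]:
    """Return (global support, global confidence) for a rule X -> Y.

    Global confidence is taken here as the ratio of the two synthesized
    supports; this is an illustrative assumption, not the paper's formula.
    """
    weights = source_weights(transaction_counts)
    global_support = synthesize_support(rule_supports, weights)
    global_antecedent = synthesize_support(antecedent_supports, weights)
    if global_antecedent == 0:
        return global_support, 0.0
    return global_support, global_support / global_antecedent


# Hypothetical example: two sources with 1000 and 3000 transactions.
support, confidence = synthesize_rule(
    transaction_counts=[1000, 3000],
    rule_supports=[0.10, 0.20],
    antecedent_supports=[0.20, 0.40],
)
print(support, confidence)
```

With weights proportional to transaction counts, the weighted support of a rule reported by every source equals the support that mono-mining would find on the pooled data: in the example, (100 + 600) / 4000 = 0.175.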
