A Distributed Algorithm Based on Competitive Neural Network for Mining Frequent Patterns

Although the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm, it is unrealistic to construct a memory-based FP-tree when the dataset is huge, because the FP-tree is too large to be held entirely in memory. In this study, we propose a novel method, the competitive-network-based FP-growth method (CNFP), which combines a competitive neural network with FP-growth to mine frequent patterns. In competitive learning, similar patterns are grouped by the network and represented by a single neuron. This grouping is done automatically, based on correlations in the data, so that a huge database is divided into sets of similar data. After competitive learning, each neuron in the competitive layer is regarded as the root of an FP-sub-tree whose transactions are similar to each other. Frequent patterns are then mined from each FP-sub-tree, which decomposes the mining task into a set of smaller tasks and dramatically reduces the search space. We apply CNFP to mine frequent patterns in Web log files and to discover association rules among the URL pages that users access. This not only helps to discover user access patterns effectively, but also provides a sound basis for the Webmaster's decisions when designing a personalized Web site. Our experiments on a large real data set show that the approach is efficient and practical for mining association rules on Web site pages.
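
The partition-then-mine idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the network architecture, the encoding of transactions, the learning rule, or how supports are combined across sub-trees. The sketch assumes binary item vectors, a plain winner-take-all update with a fixed learning rate, and a per-group support threshold, and it swaps in brute-force itemset counting where CNFP would build an FP-sub-tree and run FP-growth. All names (`competitive_partition`, `frequent_itemsets`, the sample URLs) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def competitive_partition(transactions, n_neurons=2, epochs=20, lr=0.5):
    """Group similar transactions with winner-take-all competitive learning.

    Assumption: each transaction is encoded as a binary vector over all items.
    """
    items = sorted({item for t in transactions for item in t})
    vectors = [[1.0 if item in t else 0.0 for item in items] for t in transactions]

    # Deterministic initialisation: copy the first distinct input vectors.
    weights = []
    for v in vectors:
        if v not in weights:
            weights.append(list(v))
        if len(weights) == n_neurons:
            break

    def winner(v):
        # The winning neuron is the one whose weight vector is closest to v.
        return min(
            range(len(weights)),
            key=lambda k: sum((x - w) ** 2 for x, w in zip(v, weights[k])),
        )

    for _ in range(epochs):
        for v in vectors:
            k = winner(v)
            # Only the winner moves towards the input.
            weights[k] = [w + lr * (x - w) for w, x in zip(weights[k], v)]

    groups = defaultdict(list)
    for t, v in zip(transactions, vectors):
        groups[winner(v)].append(t)
    return list(groups.values())


def frequent_itemsets(transactions, min_count, max_size=3):
    """Brute-force stand-in for FP-growth on one group of transactions."""
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(t), size):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_count}


# Hypothetical Web sessions: each set holds the URLs visited in one session.
sessions = [
    {"/home", "/news", "/sports"},
    {"/home", "/news"},
    {"/home", "/news", "/sports"},
    {"/shop", "/cart", "/pay"},
    {"/shop", "/cart"},
    {"/shop", "/cart", "/pay"},
]

groups = competitive_partition(sessions, n_neurons=2)
# Each group is mined independently, so each task sees a smaller search space.
patterns = [frequent_itemsets(g, min_count=3) for g in groups]
for group, found in zip(groups, patterns):
    print(len(group), "sessions ->", sorted(found))
```

Because each group is mined on its own, the counts here are local to a group. How CNFP relates such local counts to a global minimum support is not stated in the abstract, so the per-group threshold above is only an illustrative choice.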
