Paper Information - Towards a Cost-Effective Parallel Data Mining Approach


Massive rule induction has recently emerged as a powerful data mining technique. The problem is known to be exponential in the number of attributes and, given its ever-increasing use, can benefit greatly from parallelization. In this paper, we study cost-effective approaches to parallelizing rule generation algorithms. In particular, we consider the propositional rule generation algorithm of the Discovery Board system and present the design and implementation of a parallel algorithm for the same task. We then present early performance results for our parallelization scheme on hardware and software distributed shared memory multiprocessors.

Liviu Iftode | Aashu Virmani | Zoltan Jarai
