DWMiner: A Tool for Mining Frequent Item Sets Efficiently in Data Warehouses

This work presents DWMiner, an association rules efficient mining tool to process data directly over a relational DBMS data warehouse. DWMiner executes the Apriori algorithm as SQL queries in parallel, using a database PC Cluster middleware developed for SQL query optimization in OLAP applications. DWMiner combines intra- and inter-query parallelism in order to reduce the total time needed to find frequent item sets directly from a data warehouse. DWMiner was tested using the BMS-Web-View1 database from KDD-Cup 2000 and obtained linear and super-linear speedups.

[1]  Rakesh Agrawal,et al.  Parallel Mining of Association Rules , 1996, IEEE Trans. Knowl. Data Eng..

[2]  Marco Danelutto,et al.  Euro-Par 2004 Parallel Processing , 2004, Lecture Notes in Computer Science.

[3]  Arun N. Swami,et al.  Set-oriented mining for association rules in relational databases , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[4]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[5]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[6]  Marta Mattoso,et al.  ParGRES: a middleware for executing OLAP queries in parallel , 2005 .

[7]  Carla E. Brodley,et al.  KDD-Cup 2000 organizers' report: peeling the onion , 2000, SKDD.

[8]  Heikki Mannila,et al.  Fast Discovery of Association Rules , 1996, Advances in Knowledge Discovery and Data Mining.

[9]  M.A.W. Houtsma,et al.  Set-Oriented Mining for Association Rules , 1993, ICDE 1993.

[10]  Heikki Mannila,et al.  Verkamo: Fast Discovery of Association Rules , 1996, KDD 1996.

[11]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[12]  Sunita Sarawagi,et al.  Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications , 1998, SIGMOD '98.

[13]  Fuat Akal,et al.  OLAP Query Evaluation in a Database Cluster: A Performance Study on Intra-Query Parallelism , 2002, ADBIS.

[14]  Marta Mattoso,et al.  Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster , 2004, J. Inf. Data Manag..

[15]  Marta Mattoso,et al.  OLAP Query Processing in a Database Cluster , 2004, Euro-Par.

[16]  Heiko Schuldt,et al.  FAS - A Freshness-Sensitive Coordination Middleware for a Cluster of OLAP Components , 2002, VLDB.