Ad-hoc association-rule mining within the data warehouse

Many organizations underutilize their existing data warehouses. In this paper, we suggest a way of acquiring more information from corporate data warehouses without the complications and drawbacks of deploying additional software systems. Association-rule mining, which captures co-occurrence patterns within data, has attracted considerable effort from data warehousing researchers and practitioners alike. Unfortunately, most data-mining tools are loosely coupled, at best, with the data warehouse repository. Furthermore, these tools can often find association rules only within the main fact table of the data warehouse (thus ignoring the information-rich dimensions of the star schema) and are not easily applied to the non-transaction-level data often found in data warehouses. We present a new data-mining framework that is tightly integrated with data warehousing technology. Our framework has several advantages over the use of separate data-mining tools. First, the data stays in the data warehouse, which greatly reduces the burden of managing security and privacy issues. Second, we utilize the query processing power of the data warehouse itself rather than that of a separate data-mining tool. Third, the framework allows ad-hoc data-mining queries over the whole data warehouse, not just over the transformed portion of the data that a standard data-mining tool requires. Finally, the framework expands the domain of association-rule mining from transaction-level data to aggregated data.
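To make the idea of mining inside the warehouse concrete, the following is a minimal sketch, not the framework proposed in the paper. It assumes a hypothetical two-table star schema (a `sales` fact table and a `product` dimension, both invented for illustration) held in an in-memory SQLite database, and it finds two-item association rules over a dimension attribute (`category`) using only SQL executed by the database engine. The function names, thresholds, and sample rows are all assumptions.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny, hypothetical star schema: one fact table, one dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE sales (transaction_id INTEGER, product_id INTEGER);
        INSERT INTO product VALUES
            (1, 'bread'), (2, 'milk'), (3, 'beer'), (4, 'diapers');
        INSERT INTO sales VALUES
            (1, 1), (1, 2),
            (2, 1), (2, 2), (2, 3),
            (3, 3), (3, 4),
            (4, 1), (4, 2), (4, 4);
        """
    )
    return conn


# Rules lhs -> rhs are computed entirely by the SQL engine.
# support    = transactions containing both items / all transactions
# confidence = transactions containing both items / transactions containing lhs
PAIR_RULES_SQL = """
WITH basket AS (
    -- Join the fact table to the dimension so rules use a dimension attribute.
    SELECT DISTINCT s.transaction_id AS tid, p.category AS item
    FROM sales s JOIN product p ON p.product_id = s.product_id
),
item_count AS (
    SELECT item, COUNT(*) AS n FROM basket GROUP BY item
),
pair_count AS (
    SELECT a.item AS lhs, b.item AS rhs, COUNT(*) AS n
    FROM basket a JOIN basket b ON a.tid = b.tid AND a.item <> b.item
    GROUP BY a.item, b.item
),
total AS (
    SELECT COUNT(DISTINCT tid) AS n FROM basket
)
SELECT pc.lhs,
       pc.rhs,
       1.0 * pc.n / total.n AS support,
       1.0 * pc.n / ic.n AS confidence
FROM pair_count pc
JOIN item_count ic ON ic.item = pc.lhs
CROSS JOIN total
WHERE 1.0 * pc.n / total.n >= :min_support
  AND 1.0 * pc.n / ic.n >= :min_confidence
ORDER BY pc.lhs, pc.rhs
"""


def mine_pair_rules(conn, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) tuples that meet both thresholds."""
    params = {"min_support": min_support, "min_confidence": min_confidence}
    return conn.execute(PAIR_RULES_SQL, params).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    for rule in mine_pair_rules(warehouse, min_support=0.5, min_confidence=0.8):
        print(rule)
```

Because the rules are produced by a single query, the data never leaves the database, and changing the mined attribute or the thresholds only requires editing the query or its parameters. A real warehouse would have more dimensions and far larger tables, so this sketch says nothing about performance.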
