The Incremental Mining of Constrained Cube Gradients

Mining cube gradients extends traditional association rule mining to data cubes and has broad applications. In this paper, we consider the problem of mining constrained cube gradients in partially materialized data cubes: extracting interesting gradient-probe cell pairs from a partially materialized cube as cells are added or deleted. Instead of searching the updated cube from scratch, we present an incremental mining algorithm, IncA, which makes full use of the cube gradients already mined from the old cube. Our algorithms use the condensed cube structure to reduce the size of the materialized cubes, and IncA includes several methods that optimize the comparison of cell pairs. Performance studies show that IncA is more efficient and scalable than the direct mining algorithm DA across different constraints and sizes of materialized data cubes.
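To make the idea of reusing previously mined gradients concrete, the following is a minimal Python sketch, not the IncA algorithm itself. It assumes a cell is a tuple of dimension values with "*" as the aggregated value, that a gradient cell must be an ancestor, descendant, or sibling of its probe cell, and that the gradient constraint is a ratio threshold on a single measure. It further assumes that measures of surviving cells do not change when other cells are added or deleted. All function names and the toy data are hypothetical; the condensed cube structure and the paper's pair-comparison optimizations are not modeled.

```python
ALL = "*"  # assumed marker for an aggregated ("any value") dimension


def related(a, b):
    """True if b is an ancestor, descendant, or sibling of cell a."""
    diff = [(x, y) for x, y in zip(a, b) if x != y]
    if not diff:
        return False  # identical cells do not form a pair
    if all(y == ALL for _, y in diff):
        return True   # b generalizes a (ancestor)
    if all(x == ALL for x, _ in diff):
        return True   # b specializes a (descendant)
    return len(diff) == 1  # sibling: one dimension differs, neither is "*"


def is_gradient_pair(cells, probe, grad, min_ratio):
    """Assumed gradient constraint: measure(grad) / measure(probe) >= min_ratio."""
    return (related(probe, grad)
            and cells[probe] > 0
            and cells[grad] / cells[probe] >= min_ratio)


def mine_direct(cells, min_ratio):
    """Baseline: compare every ordered pair of materialized cells."""
    return {(p, g) for p in cells for g in cells
            if is_gradient_pair(cells, p, g, min_ratio)}


def mine_incremental(old_pairs, cells, added, deleted, min_ratio):
    """Reuse old pairs; `cells` is the cube after the update."""
    # Old pairs stay valid unless one of their cells was deleted.
    pairs = {(p, g) for p, g in old_pairs
             if p not in deleted and g not in deleted}
    # Only pairs that involve at least one added cell need comparing.
    for a in added:
        for c in cells:
            if is_gradient_pair(cells, a, c, min_ratio):
                pairs.add((a, c))
            if is_gradient_pair(cells, c, a, min_ratio):
                pairs.add((c, a))
    return pairs


# Toy cube: (dimension A, dimension B) -> measure, e.g. average price.
old_cells = {("a1", "b1"): 100, ("a1", ALL): 150, ("a2", "b1"): 300}
old_pairs = mine_direct(old_cells, 1.5)

added = {("a2", ALL): 450}
deleted = {("a2", "b1")}
new_cells = {c: m for c, m in old_cells.items() if c not in deleted}
new_cells.update(added)

new_pairs = mine_incremental(old_pairs, new_cells, added, deleted, 1.5)
```

In this sketch the incremental pass compares only pairs that touch an added cell, while the baseline compares all pairs again; under the stated assumptions both return the same set of (probe, gradient) pairs.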
