High performance data mining using data cubes on parallel computers

Online analytical processing (OLAP) techniques are used for data analysis and decision support systems. Multidimensional databases represent the multidimensionality of the underlying data well. OLAP calculations can also be used effectively for data mining in knowledge discovery, but interactive analysis requires high performance parallel systems. Precomputing aggregates in a data cube provides efficient query processing for OLAP applications. We present parallel construction of a data cube from a relational database on distributed-memory parallel computers. The data cube is then used to mine associations using attribute focusing. Results on the IBM-SP2 show that our algorithms and techniques scale to a large number of processors, providing a high performance platform for such applications.
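
To make the idea of a precomputed data cube concrete, the following is a minimal sketch in Python, not the algorithm evaluated in the paper. It assumes a COUNT aggregate, a toy relation with hypothetical dimensions (`product`, `region`, `year`), and a round-robin split of the rows that stands in for distribution across processors. Each partition builds its own partial cube, and the partial cubes are then merged, as a reduction step would do on a distributed-memory machine. The function names `local_cube`, `merge_cubes`, and `build_cube` are illustrative only.

```python
from collections import Counter
from itertools import combinations


def local_cube(rows, dims):
    """Count rows for every subset of dims (every group-by) in one partition."""
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            counts = Counter()
            for row in rows:
                # The key is the tuple of values for the grouped dimensions.
                counts[tuple(row[d] for d in group)] += 1
            cube[group] = counts
    return cube


def merge_cubes(partials):
    """Sum the partial cubes, mimicking a reduction across processors."""
    merged = {}
    for cube in partials:
        for group, counts in cube.items():
            merged.setdefault(group, Counter()).update(counts)
    return merged


def build_cube(rows, dims, n_parts):
    """Split rows round-robin into n_parts partitions and combine their cubes."""
    parts = [rows[i::n_parts] for i in range(n_parts)]
    return merge_cubes(local_cube(part, dims) for part in parts)


# Hypothetical relation: each row is one tuple of the source table.
rows = [
    {"product": "pen", "region": "east", "year": 1996},
    {"product": "pen", "region": "west", "year": 1996},
    {"product": "ink", "region": "east", "year": 1997},
    {"product": "pen", "region": "east", "year": 1997},
]
DIMS = ("product", "region", "year")

cube = build_cube(rows, DIMS, n_parts=2)
```

With three dimensions the cube holds 2^3 = 8 group-bys, so a query such as "how many rows have product pen in region east" becomes a lookup, `cube[("product", "region")][("pen", "east")]`, rather than a scan of the relation. A COUNT aggregate is used because partial counts can simply be added; other aggregates would need their own merge rule.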