A Proposal of High Performance Data Mining System

In recent years, a new kind of decision support system (DSS) has appeared, built on data warehouse, data mining, and on-line analytical processing (OLAP) technologies. As the volume of accumulated data grows enormous, data mining in large-scale databases and data warehouses runs into three problems: data quantity, data quality, and data presentation. An effective way to enhance the power and flexibility of data mining in data warehouses and large databases is to integrate data mining with OLAP in a DSS. Parallel and distributed processing are also important components of successful large-scale data mining applications. This paper proposes a high-performance data mining scheme and describes the overall architecture and mechanism of the system.
