Data Mining Processes and Collaboration Principles

Data mining is a process that draws on human skill as well as technology, and as such it can be supported by clearly defined processes and procedures. This chapter presents CRISP-DM, a well-developed standard data mining process whose phases each have clearly defined steps and deliverables. Some of the CRISP-DM phases are, by their nature, suited to being performed in an e-collaboration setting. The RAMSYS approach to data mining sets out the principles for extending CRISP-DM to support collaborative data mining. The chapter also describes the tools, systems, and evaluation procedures that the RAMSYS approach requires to reach its potential.