Development of an Enhanced Generic Data Mining Life Cycle (DMLC)

Data mining projects are complex and have a high failure rate. A well-defined life cycle is vital for improving the management and success rates of such projects. This paper reports on a research project concerned with developing a life cycle for large-scale data mining projects. It provides a detailed view of the design and development of a generic data mining life cycle called DMLC. The life cycle aims to support all members of data mining project teams, as well as IT managers and academic researchers, and may improve project success rates and strategic decision support. An extensive analysis of eight existing life cycles yields a list of their advantages, disadvantages, and characteristics. This list is extended into a consolidated set of guidelines that serve as the foundation for the development of a new generic data mining life cycle. The new life cycle is further developed to incorporate process, people, and data aspects. A detailed study of the human resources involved in a data mining project enhances the DMLC.
