Evaluation of fourteen desktop data mining tools

Fourteen desktop data mining tools (or tool modules) ranging in price from US$75 to US$25,000 (median under US$1,000) were evaluated by four undergraduates inexperienced in data mining, a relatively experienced graduate student, and a professional data mining consultant. The tools ran under the Microsoft Windows 95, Microsoft Windows NT, or Macintosh System 7.5 operating systems, and employed decision trees, rule induction, neural networks, or polynomial networks to solve two binary classification problems, a multi-class classification problem, and a noiseless estimation problem. Twenty evaluation criteria and a standardized procedure for assessing tool qualities were developed and applied. The criteria were grouped into five categories: capability, learnability/usability, interoperability, flexibility, and accuracy. Each tool's performance in each category was rated on a six-point ordinal scale to summarize the tools' relative strengths and weaknesses. This paper summarizes a lengthy technical report (Gomolka et al., 1998), which details the evaluation procedure and the scoring of all component criteria. This information should be useful to analysts selecting data mining tools, as well as to developers aiming to produce better data mining products.
