Categorization And Evaluation Of Data Mining Techniques

A fundamental issue in applying data mining algorithms to real-life problems is knowing ahead of time how usable an algorithm is for the class of problems being considered. In other words, before starting the knowledge discovery in databases (KDD) process for a particular problem P, whose features place it in a type Cj of problems or tasks, we would like to know how well a specific data mining algorithm Aj would perform in solving P. In this paper, we survey the main approaches to categorizing and evaluating data mining techniques. This helps clarify the relationship between a particular data mining algorithm and the types of tasks or problems for which it is best suited. Perhaps the most important conclusion we draw is that no single technique provides the best performance for all types of tasks, and that a multi-strategy approach is needed to deal with complex real-world problems. Categorizing data mining techniques will guide the user, before the start of the KDD process or during the data mining phase, in selecting the best subset of techniques to solve a problem or data mining task.

Transactions on Information and Communications Technologies vol 19 © 1998 WIT Press, www.witpress.com, ISSN 1743-3517