Comparing approaches to prepare data in classification problems

This paper compares DMPML with three data mining applications that implement the directed graph approach (Weka, RapidMiner, and KNIME), measuring the time spent creating and executing the data preparation tasks for two data mining algorithms. The tests used different types of data sets: numerical, categorical, and mixed. We observed that the scheme used by the DMPML framework can simplify the use of different data mining algorithms and reduce the time spent creating the data preparation tasks.
