Meta-Learning Based Framework for Helping Non-expert Miners to Choose a Suitable Classification Algorithm: An Application for the Educational Field

Selecting the best classification algorithm for a given data set is one of the most challenging tasks in the knowledge discovery process. Tools that help practitioners choose the best classifier, along with its parameter settings, are therefore in high demand. Such tools are useful not only for trainees but also for automating the data mining process. Our approach is based on meta-learning, which applies learning algorithms to meta-data extracted from data mining experiments in order to better understand how these algorithms can become flexible in solving different kinds of learning problems. This paper presents a framework that allows novices to create and populate their own experiment database and later analyse it to select the best technique for their target data set. As a case study, we evaluate different sets of meta-features on educational data sets and discuss which ones are more suitable for predicting student performance.
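The general idea of recommending a classifier from an experiment database can be illustrated with a minimal sketch. It is not the framework described in the paper: the meta-features (size, dimensionality, number of classes, class entropy), the nearest-neighbour recommendation rule, the function names and the database entries are all assumptions chosen only to make the example self-contained.

```python
import math
from collections import Counter


def meta_features(rows, labels):
    """Describe a data set by a few simple meta-features (illustrative choice)."""
    counts = Counter(labels)
    n = len(labels)
    # Shannon entropy of the class distribution, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # log10 of the instance count keeps it on a scale comparable to the others.
    return (math.log10(n), float(len(rows[0])), float(len(counts)), entropy)


def recommend(experiment_db, target):
    """Return the classifier recorded for the closest stored data set (1-NN)."""
    _, best = min(experiment_db, key=lambda entry: math.dist(entry[0], target))
    return best


# Hypothetical experiment database: (meta-features, best classifier found).
experiment_db = [
    ((2.0, 4.0, 2.0, 1.0), "naive_bayes"),
    ((4.0, 30.0, 5.0, 2.2), "random_forest"),
]

# Hypothetical target data set: 120 students, 4 attributes, balanced pass/fail.
rows = [[0.1, 1.0, 3.0, 2.0]] * 120
labels = ["pass"] * 60 + ["fail"] * 60
target = meta_features(rows, labels)
print(recommend(experiment_db, target))
```

In this sketch the recommendation is only as good as the stored experiments and the chosen meta-features, which is why comparing different sets of meta-features, as the case study does, matters.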
