A metric for unsupervised metalearning

We argue for the value of unsupervised metalearning and discuss the attendant need for suitable similarity, or distance, functions. We leverage the notion of diversity among learners used in ensemble learning to design a distance function for the clustering of learning algorithms. We revisit the most popular measures of diversity and show that only one of them, Classifier Output Difference (COD), is a metric. We then use COD to produce a clustering of 21 learning algorithms, show how this clustering differs from a clustering based on accuracy, and demonstrate how it can highlight interesting, sometimes unexpected, similarities among learning algorithms.
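The argument hinges on COD being a proper distance between learners. The following is a minimal sketch, not the authors' code, assuming the standard definition of COD as the fraction of test instances on which two classifiers' predictions disagree; the learner names, prediction vectors, and SciPy-based clustering are illustrative:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def cod(preds_a, preds_b):
    """Classifier Output Difference: the fraction of test instances on
    which two prediction vectors disagree. As a normalized Hamming
    distance it satisfies identity, symmetry, and the triangle
    inequality, i.e. it is a metric."""
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Hypothetical predictions of three learners on the same six test instances.
predictions = {
    "decision_tree": [0, 1, 1, 0, 1, 0],
    "knn":           [0, 1, 0, 0, 1, 1],
    "naive_bayes":   [1, 1, 0, 0, 0, 1],
}

# Build the pairwise COD distance matrix over the learners.
names = list(predictions)
n = len(names)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        d = cod(predictions[names[i]], predictions[names[j]])
        dist[i, j] = dist[j, i] = d

# Average-linkage hierarchical clustering of the learners from the COD
# matrix (squareform converts the square matrix to condensed form).
tree = linkage(squareform(dist), method="average")
print(dist)
print(tree)
```

Because COD compares output behavior rather than accuracy, two learners with similar error rates can still sit far apart in the resulting clustering if they err on different instances.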
