Unsupervised Meta-Learning for Clustering Algorithm Recommendation

This work proposes an unsupervised meta-learning approach in which clustering algorithms serve as the recommenders of a meta-learning system. Meta-learning has been used successfully to recommend Machine Learning algorithms for several Data Mining tasks: it ranks candidate algorithms according to their adequacy for a new dataset and uses this ranking to make recommendations. These recommendations are usually produced by predictive meta-models induced by supervised Machine Learning techniques, which require a target attribute. In many situations, the target attribute is unavailable or is computationally expensive to obtain. In such situations, unsupervised meta-models, such as clustering algorithms, can provide important insights from Machine Learning experiments, for example through the interpretation of the partitions they find. Experimental results show that the proposed approach achieved better clustering quality.
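The idea of a clustering algorithm acting as an unsupervised meta-model can be illustrated with a minimal sketch. It is not the method evaluated in this work: the meta-feature values, the plain k-means routine, the `recommend` helper, and the `known_best` annotations are all hypothetical and chosen only for illustration. The sketch assumes that each past dataset is described by a numeric meta-feature vector, that these vectors are partitioned without any target attribute, and that a new dataset receives the algorithms most often reported as best among the annotated members of its nearest cluster.

```python
import math
from collections import Counter


def kmeans(points, k, iters=50):
    """Plain k-means; the first k points are used as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def recommend(new_meta_features, centroids, labels, known_best):
    """Rank algorithms by how often they are best in the nearest cluster."""
    cluster = min(
        range(len(centroids)),
        key=lambda c: math.dist(new_meta_features, centroids[c]),
    )
    votes = Counter(
        known_best[i]
        for i, lab in enumerate(labels)
        if lab == cluster and i in known_best
    )
    return [name for name, _ in votes.most_common()]


# Hypothetical meta-feature vectors, one row per past dataset.
meta_features = [
    [0.10, 0.20], [0.90, 0.80], [0.15, 0.25],
    [0.85, 0.90], [0.20, 0.10], [0.95, 0.85],
]
# Hypothetical annotations: best algorithm for a subset of the datasets.
known_best = {0: "k-means", 2: "k-means", 4: "PAM",
              1: "hierarchical", 3: "hierarchical"}

# The partition is built from meta-features only; no target is used here.
centroids, labels = kmeans(meta_features, k=2)
ranking = recommend([0.12, 0.18], centroids, labels, known_best)
print(ranking)
```

The annotations enter only after the partition has been found, so the meta-model itself is induced without a target attribute. In practice a library implementation of the clustering step and a validity index for choosing the number of clusters would likely replace the hand-written routine.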
