Paper Information - Learning Heterogeneous Similarity Measures for Hybrid-Recommendations in Meta-Mining

The notion of meta-mining has appeared recently and extends traditional meta-learning in two ways. First, it provides support for the whole data-mining process. Second, it pries open the so-called algorithm black box: algorithms and workflows also have descriptors. With descriptors available for both datasets and data-mining workflows, we face a problem whose nature is much closer to those that arise in recommender systems. To account for the specificities of meta-mining, we derive a novel metric-learning-based recommender approach. Our method learns two homogeneous metrics, one in the dataset space and one in the workflow space, and a heterogeneous metric in the dataset-workflow space. All learned metrics reflect similarities established from the dataset-workflow preference matrix, which is constructed from the performance results obtained by applying workflows to datasets. We demonstrate our method on meta-mining over biological problems (microarray datasets). The method is not limited to meta-mining; its formulation is general enough to apply to problems with similar requirements.
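The abstract does not give the objective function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a row-centered preference matrix, homogeneous similarity targets taken as inner products of preference rows and columns, and a heterogeneous similarity of bilinear form x^T A w fitted by plain gradient descent on squared error. All data values and names (`performance`, `X`, `W`, `fit_heterogeneous`) are hypothetical.

```python
# Toy data (assumed for illustration): accuracy of each workflow on each dataset.
performance = {
    ("d1", "w1"): 0.90, ("d1", "w2"): 0.70,
    ("d2", "w1"): 0.85, ("d2", "w2"): 0.65,
    ("d3", "w1"): 0.60, ("d3", "w2"): 0.80,
}
datasets = ["d1", "d2", "d3"]
workflows = ["w1", "w2"]

# Assumed descriptors: one feature vector per dataset and per workflow.
X = {"d1": [1.0, 0.0], "d2": [0.9, 0.1], "d3": [0.0, 1.0]}
W = {"w1": [1.0, 0.0], "w2": [0.0, 1.0]}


def preference_matrix(perf, rows, cols):
    """Center each dataset's performance row so entries express relative preference."""
    Y = []
    for d in rows:
        row = [perf[(d, w)] for w in cols]
        mean = sum(row) / len(row)
        Y.append([v - mean for v in row])
    return Y


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bilinear(x, A, w):
    """Heterogeneous similarity x^T A w between a dataset and a workflow."""
    return sum(x[i] * A[i][j] * w[j] for i in range(len(x)) for j in range(len(w)))


def squared_loss(A, Y):
    """Sum of squared differences between bilinear scores and preferences."""
    return sum(
        (bilinear(X[d], A, W[w]) - Y[i][j]) ** 2
        for i, d in enumerate(datasets)
        for j, w in enumerate(workflows)
    )


def fit_heterogeneous(Y, steps=500, lr=0.1):
    """Fit A by batch gradient descent on the squared loss (an assumed objective)."""
    p, q = len(X[datasets[0]]), len(W[workflows[0]])
    A = [[0.0] * q for _ in range(p)]
    for _ in range(steps):
        grad = [[0.0] * q for _ in range(p)]
        for i, d in enumerate(datasets):
            for j, w in enumerate(workflows):
                err = bilinear(X[d], A, W[w]) - Y[i][j]
                for a in range(p):
                    for b in range(q):
                        grad[a][b] += 2.0 * err * X[d][a] * W[w][b]
        for a in range(p):
            for b in range(q):
                A[a][b] -= lr * grad[a][b]
    return A


Y = preference_matrix(performance, datasets, workflows)

# Homogeneous similarity targets derived from the preference matrix:
# datasets are similar when they prefer the same workflows, and vice versa.
dataset_targets = [[dot(r1, r2) for r2 in Y] for r1 in Y]
columns = [[Y[i][j] for i in range(len(datasets))] for j in range(len(workflows))]
workflow_targets = [[dot(c1, c2) for c2 in columns] for c1 in columns]

initial_loss = squared_loss([[0.0, 0.0], [0.0, 0.0]], Y)
A = fit_heterogeneous(Y)
final_loss = squared_loss(A, Y)
```

In this toy setting, d1 and d2 favor the same workflow, so their target similarity is positive, while d1 and d3 disagree and get a negative one. A metric learned in descriptor space would be trained to reproduce such targets, which is what lets it score datasets or workflows that have no row or column in the preference matrix.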
