A Linear Ensemble of Individual and Blended Models for Music Rating Prediction

Track 1 of KDD Cup 2011 aims at predicting the rating behavior of users in the Yahoo! Music system. At National Taiwan University, we organize a course that teams up students to work on both tracks of KDD Cup 2011. For Track 1, we first tackle the problem by building variants of existing individual models, including Matrix Factorization, Restricted Boltzmann Machine, k-Nearest Neighbors, Probabilistic Latent Semantic Analysis, Probabilistic Principal Component Analysis and Supervised Regression. We then blend the individual models, along with some carefully extracted features, in a non-linear manner. A large linear ensemble that contains both the individual and the blended models is learned and taken through some post-processing steps to form the final solution. These four stages, namely individual model building, non-linear blending, linear ensembling and post-processing, lead to a successful final solution, within which techniques for feature engineering and aggregation (blending and ensemble learning) play crucial roles. Our team is the first-prize winner of both tracks of KDD Cup 2011.
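The linear-ensemble stage described above can be illustrated with a minimal sketch: given held-out predictions from several base models, learn blending weights by ridge-regularized least squares and combine the predictions linearly. The data here is simulated and the ridge formulation is an assumption for illustration, not the paper's exact ensemble procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated held-out ratings on a 0-100 scale (as in the Yahoo! Music data).
y = rng.uniform(0, 100, size=500)

# Simulated predictions from three hypothetical base models:
# each is the truth corrupted by noise of a different magnitude.
preds = np.column_stack(
    [y + rng.normal(0, s, size=500) for s in (5.0, 10.0, 20.0)]
)

# Ridge-regularized least squares for the blending weights w:
#   minimize ||preds @ w - y||^2 + lam * ||w||^2
lam = 1e-3
A = preds.T @ preds + lam * np.eye(preds.shape[1])
w = np.linalg.solve(A, preds.T @ y)

# Linear blend of the base-model predictions.
blended = preds @ w


def rmse(p):
    return float(np.sqrt(np.mean((p - y) ** 2)))
```

On the held-out set the learned blend can do no worse than any single base model (up to the small regularization term), which is the basic reason such linear ensembles of diverse models are effective.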
