Stream-based semi-supervised learning for recommender systems

To alleviate the data sparsity inherent to recommender systems, we propose a semi-supervised framework for stream-based recommendations. The framework exploits abundant unlabelled information to improve the quality of recommendations. We extend a state-of-the-art matrix factorization algorithm with the ability to add new dimensions to the matrix at runtime, and we implement two approaches to semi-supervised learning: co-training and self-learning. We also introduce a new evaluation protocol that includes statistical testing and parameter optimization, and we evaluate the framework on five real-world datasets in a stream setting. On all five datasets, our method achieves statistically significant improvements in recommendation quality.
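
The two technical ideas named in the abstract, a factorization whose matrix can grow at runtime and learners that label unrated pairs for each other, can be illustrated with a minimal sketch. This is not the authors' implementation. The class `GrowingMF`, the function `co_training_step`, the count-based `is_reliable` check, the `min_labels` threshold, and all hyperparameter values are assumptions made for illustration only.

```python
import random
from collections import Counter


class GrowingMF:
    """Incremental matrix factorization trained by SGD, one rating at a time."""

    def __init__(self, k=8, lr=0.05, reg=0.02, seed=0):
        self.k, self.lr, self.reg = k, lr, reg   # assumed hyperparameters
        self.rng = random.Random(seed)
        self.users, self.items = {}, {}          # id -> latent vector
        self.labelled = Counter()                # true ratings seen per user/item id

    def _vector(self, table, key):
        # An unseen user or item adds a new row/column to the factorized matrix.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def predict(self, user, item):
        p, q = self._vector(self.users, user), self._vector(self.items, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating, is_label=True):
        # One regularized SGD step on a single (user, item, rating) triple.
        p, q = self._vector(self.users, user), self._vector(self.items, item)
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        if is_label:
            # Only true ratings count towards reliability, not pseudo-labels.
            self.labelled[("u", user)] += 1
            self.labelled[("i", item)] += 1
        return err

    def is_reliable(self, user, item, min_labels=5):
        # Hypothetical reliability check: enough true ratings for both ids.
        return min(self.labelled[("u", user)],
                   self.labelled[("i", item)]) >= min_labels


def co_training_step(model_a, model_b, user, item):
    """Let each reliable learner label an unrated (user, item) pair for the other."""
    pred_a, pred_b = model_a.predict(user, item), model_b.predict(user, item)
    ok_a, ok_b = model_a.is_reliable(user, item), model_b.is_reliable(user, item)
    if ok_a:
        model_b.update(user, item, pred_a, is_label=False)
    if ok_b:
        model_a.update(user, item, pred_b, is_label=False)


# Usage: learner a sees true ratings; learner b learns only from a's predictions.
a, b = GrowingMF(seed=1), GrowingMF(seed=2)
for _ in range(200):
    a.update("u1", "i1", 4.0)
for _ in range(200):
    co_training_step(a, b, "u1", "i1")
```

The reliability gate matters in this sketch: without it, an untrained learner would also push its near-zero predictions onto the trained one. In a self-learning variant a single learner would feed its own reliable predictions back into `update`.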
