Active Matrix Completion

Recovering a matrix from a sample of its entries is a problem of rapidly growing interest, studied under the name of matrix completion. It arises in many areas of engineering and applied science. In most machine learning and data mining applications, the expertise of human oracles can be leveraged to improve the performance of the system, so it is natural to extend this "human-in-the-loop" idea to the matrix completion problem. However, given the enormity of data in the modern era, manually completing all the entries in a matrix is expensive in terms of time, labor and human expertise; human oracles can only provide selective supervision to guide the solution process. Appropriately identifying a subset of missing entries (for manual annotation) in an incomplete matrix is therefore of paramount practical importance, since it can potentially lead to better reconstructions of the incomplete matrix with minimal human effort. In this paper, we propose novel algorithms to address this issue. Since the query locations are actively selected by the algorithms, we refer to these methods as active matrix completion algorithms. The proposed techniques are generic, and the same frameworks can be used in a wide variety of applications, including recommendation systems, transductive / multi-label active learning, active learning in regression and active feature acquisition, among others. Our extensive empirical analysis on several challenging real-world datasets certifies the merit and versatility of the proposed frameworks in efficiently exploiting human intelligence in data mining / machine learning applications.
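To make the problem setting concrete, the following is a minimal sketch of one active matrix completion round. It is not the algorithm proposed in the paper, which the abstract does not specify. It assumes a simple stand-in strategy: complete the matrix at several candidate ranks with an iterative truncated-SVD imputation, then query the unobserved entries on which those completions disagree most. The function names `complete_low_rank` and `select_queries`, the choice of ranks, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def complete_low_rank(M, mask, rank, n_iters=100):
    """Iterative imputation: alternate a rank-`rank` truncated SVD with
    re-inserting the observed entries (mask == True)."""
    X = np.where(mask, M, 0.0)  # start with zeros in the missing positions
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, low_rank)  # keep observed entries fixed
    return low_rank


def select_queries(M, mask, ranks=(1, 2, 3), batch_size=5):
    """Return the unobserved positions where completions of different
    rank disagree most (an assumed, committee-style selection rule)."""
    committee = np.stack([complete_low_rank(M, mask, r) for r in ranks])
    disagreement = committee.var(axis=0)
    disagreement[mask] = -np.inf  # never re-query an observed entry
    flat = np.argsort(disagreement, axis=None)[::-1][:batch_size]
    rows, cols = np.unravel_index(flat, M.shape)
    return [(int(i), int(j)) for i, j in zip(rows, cols)]


# Synthetic demo: a rank-2 matrix with roughly 40% of entries observed.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
mask = rng.random(truth.shape) < 0.4
observed_before = mask.copy()

queries = select_queries(truth, mask, batch_size=5)
for i, j in queries:
    mask[i, j] = True  # the "oracle" reveals truth[i, j]
```

In a real application the oracle would be a human annotator rather than a lookup into `truth`, and the select-then-reveal step would repeat until the labeling budget is exhausted. The sketch also assumes at least `batch_size` entries are still unobserved.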
