A disagreement-based active matrix completion approach with provable guarantee

Active matrix completion (AMC) is an effective approach to improving the performance of matrix completion. It actively acquires certain missing entries of a target matrix, with the aim of quickly improving the completion accuracy of the remaining entries. Although this topic is attracting increasing attention, all existing solutions are heuristic. In this paper, we propose a new active matrix completion approach called Factor-Disagreed AMC (FDAMC), with a provable performance guarantee. It extends the popular disagreement-based active statistical learning to the matrix completion task, at both the methodological and theoretical levels. Specifically, FDAMC learns two factorization models to complete the matrix, and acquires those missing entries on which the two models' completions disagree. By employing and modifying PAC theory, we prove the sample complexity of this approach. An FDAMC algorithm is also presented, with some heuristics included. Finally, we present proof-of-concept experiments and demonstrate the effectiveness of FDAMC on both synthetic and real-world data sets.
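
The disagreement-based acquisition idea can be illustrated with a minimal Python sketch. It is not the FDAMC algorithm from the paper: the abstract does not say how the two factorization models differ or how disagreement is measured, so this sketch assumes two SGD-trained low-rank models that differ only in their random initialization, and it queries the missing entries with the largest absolute gap between the two predictions. The names `factorize`, `predict`, and `select_queries`, and all hyperparameters, are hypothetical.

```python
import random

def factorize(observed, n_rows, n_cols, rank=2, seed=0,
              epochs=200, lr=0.05, reg=0.01):
    """Fit factors U (n_rows x rank) and V (n_cols x rank) by SGD
    on the observed entries only. `observed` maps (i, j) -> value."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), value in entries:
            err = value - sum(u * v for u, v in zip(U[i], V[j]))
            for r in range(rank):
                u, v = U[i][r], V[j][r]
                U[i][r] += lr * (err * v - reg * u)
                V[j][r] += lr * (err * u - reg * v)
    return U, V

def predict(U, V, i, j):
    """Completed value of entry (i, j) under the factor model (U, V)."""
    return sum(u * v for u, v in zip(U[i], V[j]))

def select_queries(observed, n_rows, n_cols, budget, rank=2):
    """Return up to `budget` missing entries on which two models disagree most.
    Assumption: the models differ only in their random seed, and disagreement
    is the absolute difference between their predictions."""
    model_a = factorize(observed, n_rows, n_cols, rank, seed=1)
    model_b = factorize(observed, n_rows, n_cols, rank, seed=2)
    missing = [(i, j) for i in range(n_rows) for j in range(n_cols)
               if (i, j) not in observed]

    def gap(entry):
        return abs(predict(*model_a, *entry) - predict(*model_b, *entry))

    return sorted(missing, key=gap, reverse=True)[:budget]

# Usage: a 3 x 3 matrix with five observed entries; ask for two queries.
observed = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0, (2, 2): 3.0}
print(select_queries(observed, 3, 3, budget=2))
```

In an active loop, the returned entries would be acquired, added to `observed`, and both models refit before the next round of queries.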
