1-Bit Matrix Completion

In this paper we develop a theory of matrix completion for the extreme case of noisy 1-bit observations. Instead of observing a subset of the real-valued entries of a matrix $M$, we obtain a small number of binary (1-bit) measurements generated according to a probability distribution determined by the real-valued entries of $M$. The central question we ask is whether it is possible to obtain an accurate estimate of $M$ from this data. In general this would seem impossible, but we show that the maximum likelihood estimate under a suitable constraint returns an accurate estimate of $M$ when $\|M\|_\infty \le \alpha$ and $\mathrm{rank}(M) \le r$. If the log-likelihood is a concave function of the matrix entries (as under the logistic or probit observation models), then this maximum likelihood estimate can be obtained by solving a convex program. In addition, we show that if, instead of recovering $M$, we simply wish to estimate the distribution generating the 1-bit measurements, then we can eliminate the requirement that $\|M\|_\infty \le \alpha$. For both cases we provide lower bounds showing that these estimates are near-optimal. We conclude with a suite of experiments that both verify the implications of our theorems and illustrate some of the practical applications of 1-bit matrix completion. In particular, we compare our program to standard matrix completion methods on movie rating data in which users submit ratings from 1 to 5. To use our program we quantize this data to a single bit, while the standard matrix completion program retains access to the original ratings (from 1 to 5). Surprisingly, the approach based on binary data performs significantly better.
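To make the estimator concrete, the following is a minimal sketch, not the authors' implementation, of the convex program under the logistic model: projected gradient ascent on the log-likelihood over a nuclear-norm ball (the standard convex surrogate for the rank constraint), with entries additionally clipped to $[-\alpha, \alpha]$. The function names, step size, and the simple alternating clip-after-project step are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def project_nuclear_ball(M, radius):
    """Project M onto {X : ||X||_* <= radius} by projecting its
    singular values onto the simplex of the given radius."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    if s.sum() <= radius:
        return M
    u = np.sort(s)[::-1]                      # singular values, descending
    css = np.cumsum(u)
    k = np.nonzero(u - (css - radius) / np.arange(1, len(u) + 1) > 0)[0][-1]
    tau = (css[k] - radius) / (k + 1)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fit_one_bit(Y, mask, radius, alpha, step=1.0, n_iters=500):
    """Maximize the logistic log-likelihood of observed signs Y in {-1,+1}
    (mask marks observed entries) over matrices with bounded nuclear norm
    and bounded entries, via projected gradient ascent."""
    M = np.zeros_like(Y, dtype=float)
    for _ in range(n_iters):
        # Gradient of sum_{observed} log sigmoid(Y_ij * M_ij)
        grad = mask * Y * sigmoid(-Y * M)
        M = project_nuclear_ball(M + step * grad, radius)
        # Entrywise bound ||M||_inf <= alpha; a single clip after the
        # nuclear projection is a heuristic stand-in for the exact
        # projection onto the intersection of the two constraint sets.
        M = np.clip(M, -alpha, alpha)
    return M
```

In a usage run one would draw $Y_{ij} = +1$ with probability $\sigma(M_{ij})$ on a random mask of observed entries and set the radius to $\alpha\sqrt{r\,n_1 n_2}$, the scaling suggested by the rank and infinity-norm constraints.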
