1-Bit Matrix Completion

In this paper we develop a theory of matrix completion for the extreme case of noisy 1-bit observations. Instead of observing a subset of the real-valued entries of a matrix $M$, we obtain a small number of binary (1-bit) measurements generated according to a probability distribution determined by the real-valued entries of $M$. The central question we ask is whether it is possible to obtain an accurate estimate of $M$ from this data. In general this would seem impossible, but we show that the maximum likelihood estimate under a suitable constraint returns an accurate estimate of $M$ when $\|M\|_\infty \le \alpha$ and $\mathrm{rank}(M) \le r$. If the log-likelihood is a concave function of the matrix entries (as under the logistic or probit observation models), then this maximum likelihood estimate can be obtained by solving a convex program. In addition, we show that if, instead of recovering $M$, we simply wish to estimate the distribution generating the 1-bit measurements, then we can eliminate the requirement that $\|M\|_\infty \le \alpha$. For both cases we provide lower bounds showing that these estimates are near-optimal. We conclude with a suite of experiments that both verify the implications of our theorems and illustrate some of the practical applications of 1-bit matrix completion. In particular, we compare our program to standard matrix completion methods on movie rating data in which users submit ratings from 1 to 5. To use our program we quantize this data to a single bit, while the standard matrix completion program retains access to the original ratings (from 1 to 5). Surprisingly, the approach based on binary data performs significantly better.
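To make the estimator concrete, the following is a minimal sketch, not the authors' implementation, of the convex program under the logistic model: projected gradient ascent on the log-likelihood over a nuclear-norm ball (the standard convex surrogate for the rank constraint), with entries additionally clipped to $[-\alpha, \alpha]$. The function names, step size, and the simple alternating clip-after-project step are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def project_nuclear_ball(M, radius):
    """Project M onto {X : ||X||_* <= radius} by projecting its
    singular values onto the simplex of the given radius."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    if s.sum() <= radius:
        return M
    u = np.sort(s)[::-1]                      # singular values, descending
    css = np.cumsum(u)
    k = np.nonzero(u - (css - radius) / np.arange(1, len(u) + 1) > 0)[0][-1]
    tau = (css[k] - radius) / (k + 1)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fit_one_bit(Y, mask, radius, alpha, step=1.0, n_iters=500):
    """Maximize the logistic log-likelihood of observed signs Y in {-1,+1}
    (mask marks observed entries) over matrices with bounded nuclear norm
    and bounded entries, via projected gradient ascent."""
    M = np.zeros_like(Y, dtype=float)
    for _ in range(n_iters):
        # Gradient of sum_{observed} log sigmoid(Y_ij * M_ij)
        grad = mask * Y * sigmoid(-Y * M)
        M = project_nuclear_ball(M + step * grad, radius)
        # Entrywise bound ||M||_inf <= alpha; a single clip after the
        # nuclear projection is a heuristic stand-in for the exact
        # projection onto the intersection of the two constraint sets.
        M = np.clip(M, -alpha, alpha)
    return M
```

In a usage run one would draw $Y_{ij} = +1$ with probability $\sigma(M_{ij})$ on a random mask of observed entries and set the radius to $\alpha\sqrt{r\,n_1 n_2}$, the scaling suggested by the rank and infinity-norm constraints.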
