Matrix Completion and Related Problems via Strong Duality

This work studies strong duality for non-convex matrix factorization problems: we show that under certain dual conditions, these problems and their duals attain the same optimum. Strong duality is well understood for convex optimization, but little was known in the non-convex setting. We propose a novel analytical framework and show that, under these dual conditions, the optimal solution of the matrix factorization program coincides with that of its bi-dual; global optimality of the non-convex program can therefore be achieved by solving its bi-dual, which is convex. Although matrix factorization problems are hard to solve in full generality, the dual conditions are satisfied by a wide class of them. The analytical framework may be of independent interest to non-convex optimization more broadly. We apply it to two prototypical matrix factorization problems: matrix completion and robust principal component analysis (PCA), both of which ask for efficient recovery of a hidden matrix from limited reliable observations. Our framework shows that exact recoverability and strong duality hold for matrix completion and robust PCA with nearly optimal sample complexity guarantees.
