Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution
Cong Ma | Kaizheng Wang | Yuejie Chi | Yuxin Chen
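The title refers to running plain gradient descent, with spectral initialization but no explicit regularization, on nonconvex losses such as the phase retrieval objective. As a rough illustration of that setup (not the paper's exact algorithm or constants; the step size, iteration count, and problem sizes below are illustrative choices), a minimal Wirtinger-flow-style sketch with Gaussian measurements is:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 200                      # signal dimension, number of measurements
x_star = rng.standard_normal(n)     # ground-truth signal
A = rng.standard_normal((m, n))     # Gaussian sensing vectors a_i as rows
y = (A @ x_star) ** 2               # phaseless measurements y_i = (a_i^T x*)^2

# Spectral initialization: leading eigenvector of (1/m) sum_i y_i a_i a_i^T,
# scaled so the iterate has roughly the right energy.
Y = (A * y[:, None]).T @ A / m
_, V = np.linalg.eigh(Y)            # eigh returns eigenvalues in ascending order
x = V[:, -1] * np.sqrt(y.mean())

# Vanilla gradient descent on f(x) = (1/4m) sum_i ((a_i^T x)^2 - y_i)^2.
eta = 0.1 / y.mean()                # step size on the order of 1 / ||x*||^2
for _ in range(500):
    Ax = A @ x
    grad = (A * ((Ax ** 2 - y) * Ax)[:, None]).sum(axis=0) / m
    x -= eta * grad

# The signal is only identifiable up to a global sign.
dist = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
print(dist)
```

With this many measurements per dimension, the iterates typically contract linearly toward the ground truth, which is the qualitative behavior the paper's title asserts.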
[1] M. Rudelson,et al. Hanson-Wright inequality and sub-gaussian concentration , 2013 .
[2] V. Koltchinskii,et al. Nuclear norm penalization and optimal rates for noisy low rank matrix completion , 2010, 1011.6256.
[3] Yoram Bresler,et al. ADMiRA: Atomic Decomposition for Minimum Rank Approximation , 2009, IEEE Transactions on Information Theory.
[4] John D. Lafferty,et al. Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent , 2016, ArXiv.
[5] John Wright,et al. On the Global Geometry of Sphere-Constrained Sparse Blind Deconvolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Xiaodong Li,et al. Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization , 2016, Applied and Computational Harmonic Analysis.
[7] Tom Goldstein,et al. PhaseMax: Convex Phase Retrieval via Basis Pursuit , 2016, IEEE Transactions on Information Theory.
[8] Yonina C. Eldar,et al. Non-Convex Phase Retrieval From STFT Measurements , 2016, IEEE Transactions on Information Theory.
[9] Mahdi Soltanolkotabi,et al. Structured Signal Recovery From Quadratic Measurements: Breaking Sample Complexity Barriers via Nonconvex Optimization , 2017, IEEE Transactions on Information Theory.
[10] P. Wedin. Perturbation bounds in connection with singular value decomposition , 1972 .
[11] Trevor J. Hastie,et al. Matrix completion and low-rank SVD via fast alternating least squares , 2014, J. Mach. Learn. Res..
[12] A. Montanari,et al. The landscape of empirical risk for nonconvex losses , 2016, The Annals of Statistics.
[13] Pablo A. Parrilo,et al. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization , 2007, SIAM Rev..
[14] Lorenzo Rosasco,et al. Generalization Properties and Implicit Regularization for Multiple Passes SGM , 2016, ICML.
[15] Prateek Jain,et al. Fast Exact Matrix Completion with Finite Samples , 2014, COLT.
[16] Feng Ruan,et al. Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval , 2017, Information and Inference: A Journal of the IMA.
[17] Y. Bresler,et al. Blind gain and phase calibration for low-dimensional or sparse signal sensing via power iteration , 2017, 2017 International Conference on Sampling Theory and Applications (SampTA).
[18] Maxim Sviridenko,et al. Concentration and moment inequalities for polynomials of independent random variables , 2012, SODA.
[19] Nathan Srebro,et al. Global Optimality of Local Search for Low Rank Matrix Recovery , 2016, NIPS.
[20] P. Bickel,et al. On robust regression with high-dimensional predictors , 2013, Proceedings of the National Academy of Sciences.
[21] Gang Wang,et al. Solving large-scale systems of random quadratic equations via stochastic truncated amplitude flow , 2017, 2017 25th European Signal Processing Conference (EUSIPCO).
[22] Adel Javanmard,et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks , 2017, IEEE Transactions on Information Theory.
[23] Mary Wootters,et al. Fast matrix completion without the condition number , 2014, COLT.
[24] Liming Wang,et al. Blind Deconvolution From Multiple Sparse Inputs , 2016, IEEE Signal Processing Letters.
[25] Justin Romberg,et al. Fast and Guaranteed Blind Multichannel Deconvolution Under a Bilinear System Model , 2016, IEEE Transactions on Information Theory.
[26] Yudong Chen,et al. Incoherence-Optimal Matrix Completion , 2013, IEEE Transactions on Information Theory.
[27] Yuxin Chen,et al. Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval , 2018, Mathematical Programming.
[28] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[29] Andrea Montanari,et al. Fundamental Limits of Weak Recovery with Applications to Phase Retrieval , 2017, COLT.
[30] Cong Ma,et al. A Selective Overview of Deep Learning , 2019, Statistical science : a review journal of the Institute of Mathematical Statistics.
[31] Prateek Jain,et al. Non-convex Robust PCA , 2014, NIPS.
[32] Yi Ma,et al. Robust principal component analysis? , 2009, JACM.
[33] Yonina C. Eldar,et al. Phase retrieval from STFT measurements via non-convex optimization , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Nicolas Boumal,et al. Near-Optimal Bounds for Phase Synchronization , 2017, SIAM J. Optim..
[35] J. Tanner,et al. Low rank matrix completion by alternating steepest descent methods , 2016 .
[36] Xiaodong Li,et al. Solving Quadratic Equations via PhaseLift When There Are About as Many Equations as Unknowns , 2012, Found. Comput. Math..
[37] Sujay Sanghavi,et al. The Local Convexity of Solving Systems of Quadratic Equations , 2015, 1506.07868.
[38] Yonina C. Eldar,et al. Sparsity Based Sub-wavelength Imaging with Partially Incoherent Light via Quadratic Compressed Sensing , 2011, Optics Express.
[39] N. Alon,et al. The Probabilistic Method , 2008 .
[40] Tengyu Ma,et al. On the optimization landscape of tensor decompositions , 2017, Mathematical Programming.
[41] Emmanuel J. Candès,et al. Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..
[42] Chandler Davis. The rotation of eigenvectors by a perturbation , 1963 .
[43] Emmanuel J. Candès,et al. PhaseLift: Exact and Stable Signal Recovery from Magnitude Measurements via Convex Programming , 2011, ArXiv.
[44] Ken Kreutz-Delgado,et al. The Complex Gradient Operator and the CR-Calculus ECE275A - Lecture Supplement - Fall 2005 , 2009, 0906.4835.
[45] David Gross,et al. Recovering Low-Rank Matrices From Few Coefficients in Any Basis , 2009, IEEE Transactions on Information Theory.
[46] Andrea J. Goldsmith,et al. Exact and Stable Covariance Estimation From Quadratic Sampling via Convex Programming , 2013, IEEE Transactions on Information Theory.
[47] Martin J. Wainwright,et al. Restricted strong convexity and weighted matrix completion: Optimal bounds with noise , 2010, J. Mach. Learn. Res..
[48] Ryota Tomioka,et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning , 2014, ICLR.
[49] Prateek Jain,et al. Thresholding Based Outlier Robust PCA , 2017, COLT.
[50] Yuxin Chen,et al. Nonconvex Matrix Factorization from Rank-One Measurements , 2019, AISTATS.
[51] Emmanuel J. Candès,et al. A Probabilistic and RIPless Theory of Compressed Sensing , 2010, IEEE Transactions on Information Theory.
[52] Roman Vershynin,et al. Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.
[53] Inderjit S. Dhillon,et al. Recovery Guarantees for One-hidden-layer Neural Networks , 2017, ICML.
[54] Yonina C. Eldar,et al. GESPAR: Efficient Phase Retrieval of Sparse Signals , 2013, IEEE Transactions on Signal Processing.
[55] Yue M. Lu,et al. Phase Transitions of Spectral Initialization for High-Dimensional Nonconvex Estimation , 2017, Information and Inference: A Journal of the IMA.
[56] Max Simchowitz,et al. Low-rank Solutions of Linear Matrix Equations via Procrustes Flow , 2015, ICML.
[57] John Wright,et al. A Geometric Analysis of Phase Retrieval , 2016, 2016 IEEE International Symposium on Information Theory (ISIT).
[58] Yonina C. Eldar,et al. Phase Retrieval: An Overview of Recent Developments , 2015, ArXiv.
[59] André Elisseeff,et al. Stability and Generalization , 2002, J. Mach. Learn. Res..
[60] Yuling Yan,et al. Inference and uncertainty quantification for noisy matrix completion , 2019, Proceedings of the National Academy of Sciences.
[61] Emmanuel J. Candès,et al. The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.
[62] Noureddine El Karoui,et al. On the impact of predictor geometry on the performance of high-dimensional ridge-regularized generalized robust regression estimators , 2018 .
[63] Zhi-Quan Luo,et al. Guaranteed Matrix Completion via Non-Convex Factorization , 2014, IEEE Transactions on Information Theory.
[64] Prateek Jain,et al. Low-rank matrix completion using alternating minimization , 2012, STOC '13.
[65] Sundeep Rangan,et al. Compressive Phase Retrieval via Generalized Approximate Message Passing , 2014, IEEE Transactions on Signal Processing.
[66] Tengyu Ma,et al. Matrix Completion has No Spurious Local Minimum , 2016, NIPS.
[67] Tengyao Wang,et al. A useful variant of the Davis--Kahan theorem for statisticians , 2014, 1405.0680.
[68] B. A. Schmitt. Perturbation bounds for matrix square roots and pythagorean sums , 1992 .
[69] Benjamin Recht,et al. A Simpler Approach to Matrix Completion , 2009, J. Mach. Learn. Res..
[70] Gilad Lerman,et al. A Well-Tempered Landscape for Non-convex Robust Subspace Recovery , 2017, J. Mach. Learn. Res..
[71] Wen Huang,et al. Blind Deconvolution by a Steepest Descent Algorithm on a Quotient Manifold , 2017, SIAM J. Imaging Sci..
[72] Joel A. Tropp,et al. Convex recovery of a structured signal from independent random linear measurements , 2014, ArXiv.
[73] Gongguo Tang,et al. The nonconvex geometry of low-rank matrix optimizations with general objective functions , 2016, 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[74] Yanjun Li,et al. Blind Recovery of Sparse Signals From Subsampled Convolution , 2015, IEEE Transactions on Information Theory.
[75] Abhay Pasupathy,et al. On the Global Geometry of Sphere-Constrained Sparse Blind Deconvolution , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[76] Sébastien Bubeck,et al. Convex Optimization: Algorithms and Complexity , 2014, Found. Trends Mach. Learn..
[77] Yingbin Liang,et al. Provable Non-convex Phase Retrieval with Outliers: Median TruncatedWirtinger Flow , 2016, ICML.
[78] T. Tao. Topics in Random Matrix Theory , 2012 .
[79] Damek Davis,et al. The nonsmooth landscape of phase retrieval , 2017, IMA Journal of Numerical Analysis.
[80] Chen Cheng,et al. Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices , 2018, ArXiv.
[81] Laurent Jacques,et al. A non-convex blind calibration method for randomised sensing strategies , 2016, 2016 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar and Remote Sensing (CoSeRa).
[82] A. Mukherjea,et al. Real and Functional Analysis , 1978 .
[83] Ruslan Salakhutdinov,et al. Geometry of Optimization and Implicit Regularization in Deep Learning , 2017, ArXiv.
[84] Yonina C. Eldar,et al. Convolutional Phase Retrieval via Gradient Descent , 2017, IEEE Transactions on Information Theory.
[85] J. Berge,et al. Orthogonal procrustes rotation for two or more matrices , 1977 .
[86] Yingbin Liang,et al. A Nonconvex Approach for Phase Retrieval: Reshaped Wirtinger Flow and Incremental Algorithms , 2017, J. Mach. Learn. Res..
[87] N. Higham. Estimating the matrix p-norm , 1992 .
[88] Yonina C. Eldar,et al. Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow , 2016, IEEE Transactions on Information Theory.
[89] Sujay Sanghavi,et al. The Local Convexity of Solving Quadratic Equations , 2015 .
[90] Joel A. Tropp,et al. An Introduction to Matrix Concentration Inequalities , 2015, Found. Trends Mach. Learn..
[91] Constantine Caramanis,et al. A Convex Formulation for Mixed Regression: Near Optimal Rates in the Face of Noise , 2013, ArXiv.
[92] Yuxin Chen,et al. The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences , 2016, Communications on Pure and Applied Mathematics.
[93] John Wright,et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture , 2015, IEEE Transactions on Information Theory.
[94] Bing Gao,et al. Phaseless Recovery Using the Gauss–Newton Method , 2016, IEEE Transactions on Signal Processing.
[95] Andrea Montanari,et al. Matrix completion from a few entries , 2009, 2009 IEEE International Symposium on Information Theory.
[96] Anastasios Kyrillidis,et al. Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach , 2016, AISTATS.
[97] Dacheng Tao,et al. Algorithmic Stability and Hypothesis Complexity , 2017, ICML.
[98] W. Kahan,et al. The Rotation of Eigenvectors by a Perturbation. III , 1970 .
[99] Nathan Srebro,et al. Implicit Regularization in Matrix Factorization , 2017, 2018 Information Theory and Applications Workshop (ITA).
[100] Thomas Strohmer,et al. Regularized Gradient Descent: A Nonconvex Recipe for Fast Joint Blind Deconvolution and Demixing , 2017, ArXiv.
[101] Justin K. Romberg,et al. An Overview of Low-Rank Matrix Recovery From Incomplete Observations , 2016, IEEE Journal of Selected Topics in Signal Processing.
[102] Pablo A. Parrilo,et al. Rank-Sparsity Incoherence for Matrix Decomposition , 2009, SIAM J. Optim..
[103] Yuling Yan,et al. Noisy Matrix Completion: Understanding Statistical Guarantees for Convex Relaxation via Nonconvex Optimization , 2019, SIAM J. Optim..
[104] Junwei Lu,et al. Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization , 2016, ArXiv.
[105] Gang Wang,et al. Solving Almost all Systems of Random Quadratic Equations , 2017, NIPS.
[106] Jianqing Fan,et al. ENTRYWISE EIGENVECTOR ANALYSIS OF RANDOM MATRICES WITH LOW EXPECTED RANK. , 2017, Annals of statistics.
[107] A. Fannjiang,et al. Phase Retrieval with One or Two Diffraction Patterns by Alternating Projections with the Null Initialization , 2015, 1510.07379.
[108] Ke Wei. Solving systems of phaseless equations via Kaczmarz methods: a proof of concept study , 2015 .
[109] Gang Wang,et al. Sparse Phase Retrieval via Truncated Amplitude Flow , 2016, IEEE Transactions on Signal Processing.
[110] Adel Javanmard,et al. Debiasing the lasso: Optimal sample size for Gaussian designs , 2015, The Annals of Statistics.
[111] Yuxin Chen,et al. Solving Random Quadratic Systems of Equations Is Nearly as Easy as Solving Linear Systems , 2015, NIPS.
[112] Anru Zhang,et al. ROP: Matrix Recovery via Rank-One Projections , 2013, ArXiv.
[113] Yuxin Chen,et al. Spectral Method and Regularized MLE Are Both Optimal for Top-$K$ Ranking , 2017, Annals of statistics.
[114] R. Mathias. Perturbation Bounds for the Polar Decomposition , 1997 .
[115] V. Koltchinskii,et al. Oracle inequalities in empirical risk minimization and sparse recovery problems , 2011 .
[116] Yuejie Chi,et al. Kaczmarz Method for Solving Quadratic Equations , 2016, IEEE Signal Processing Letters.
[117] Christos Thrampoulidis,et al. Phase retrieval via linear programming: Fundamental limits and algorithmic improvements , 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[118] Sham M. Kakade,et al. A tail inequality for quadratic forms of subgaussian random vectors , 2011, ArXiv.
[119] Sham M. Kakade,et al. Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent , 2016, NIPS.
[120] F. M. Dopico. A Note on Sin Θ Theorems for Singular Subspace Variations , 2000 .
[121] Martin J. Wainwright,et al. Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees , 2015, ArXiv.
[122] Justin K. Romberg,et al. Blind Deconvolution Using Convex Programming , 2012, IEEE Transactions on Information Theory.
[123] John D. Lafferty,et al. A Convergent Gradient Descent Algorithm for Rank Minimization and Semidefinite Programming from Random Linear Measurements , 2015, NIPS.
[124] Noureddine El Karoui. On the impact of predictor geometry on the performance of high-dimensional ridge-regularized generalized robust regression estimators , 2018 .
[125] Vladislav Voroninski,et al. An Elementary Proof of Convex Phase Retrieval in the Natural Parameter Space via the Linear Program PhaseMax , 2016, ArXiv.
[126] Ali Ahmed,et al. BranchHull: Convex bilinear inversion from the entrywise product of signals with known signs , 2017, Applied and Computational Harmonic Analysis.
[127] Xiaodong Li,et al. Optimal Rates of Convergence for Noisy Sparse Phase Retrieval via Thresholded Wirtinger Flow , 2015, ArXiv.
[128] Gang Wang,et al. Solving Most Systems of Random Quadratic Equations , 2017, NIPS.
[129] Tony F. Chan,et al. Guarantees of Riemannian Optimization for Low Rank Matrix Recovery , 2015, SIAM J. Matrix Anal. Appl..
[130] Dong Wang,et al. Distributed estimation of principal eigenspaces , 2017, Annals of statistics.
[131] Nicholas J. Higham. Estimating the matrix p-norm , 1992 .
[132] Zhaoran Wang,et al. A Nonconvex Optimization Framework for Low Rank Matrix Estimation , 2015, NIPS.
[133] Xiaodong Li,et al. Phase Retrieval via Wirtinger Flow: Theory and Algorithms , 2014, IEEE Transactions on Information Theory.
[134] Roy Mathias,et al. The spectral norm of a nonnegative matrix , 1990 .
[135] Yonina C. Eldar,et al. Phase Retrieval via Matrix Completion , 2011, SIAM Rev..
[136] Yuxin Chen,et al. Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview , 2018, IEEE Transactions on Signal Processing.
[137] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[138] Thomas Strohmer,et al. Self-calibration and biconvex compressive sensing , 2015, ArXiv.
[139] Yan Shuo Tan,et al. Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees , 2017, ArXiv.
[140] Gilad Lerman,et al. Fast, Robust and Non-convex Subspace Recovery , 2014, 1406.6145.
[141] Yang Wang,et al. Fast Rank-One Alternating Minimization Algorithm for Phase Retrieval , 2017, Journal of Scientific Computing.
[142] Andrea Montanari,et al. Matrix Completion from Noisy Entries , 2009, J. Mach. Learn. Res..
[143] Prateek Jain,et al. Phase Retrieval Using Alternating Minimization , 2013, IEEE Transactions on Signal Processing.
[144] Ayfer Özgür,et al. Phase Retrieval via Incremental Truncated Wirtinger Flow , 2016, ArXiv.
[145] Justin Romberg,et al. Phase Retrieval Meets Statistical Learning Theory: A Flexible Convex Relaxation , 2016, AISTATS.
[146] Yuejie Chi,et al. Guaranteed Blind Sparse Spikes Deconvolution via Lifting and Convex Optimization , 2015, IEEE Journal of Selected Topics in Signal Processing.
[147] Nathan Srebro,et al. The Implicit Bias of Gradient Descent on Separable Data , 2017, J. Mach. Learn. Res..
[148] Yuxin Chen,et al. The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square , 2017, Probability Theory and Related Fields.