Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
[1] Christopher De Sa,et al. Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems , 2014, ICML.
[2] D. Aldous. Probability Approximations via the Poisson Clumping Heuristic , 1988 .
[3] Zhaoran Wang,et al. Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time , 2014, ArXiv.
[4] Zhi-Quan Luo,et al. Guaranteed Matrix Completion via Non-Convex Factorization , 2014, IEEE Transactions on Information Theory.
[5] Anima Anandkumar,et al. Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res.
[6] Zhaoran Wang,et al. Sparse PCA with Oracle Property , 2014, NIPS.
[7] Zhaoran Wang,et al. Low-Rank and Sparse Structure Pursuit via Alternating Minimization , 2016, AISTATS.
[8] David M. Blei,et al. A Variational Analysis of Stochastic Gradient Algorithms , 2016, ICML.
[9] John Wright,et al. When Are Nonconvex Problems Not Scary? , 2015, ArXiv.
[10] E Weinan,et al. Dynamics of Stochastic Gradient Algorithms , 2015, ArXiv.
[11] Moritz Hardt,et al. Understanding Alternating Minimization for Matrix Completion , 2013, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.
[12] John D. Lafferty,et al. A Convergent Gradient Descent Algorithm for Rank Minimization and Semidefinite Programming from Random Linear Measurements , 2015, NIPS.
[13] John Wright,et al. Complete Dictionary Recovery Over the Sphere II: Recovery by Riemannian Trust-Region Method , 2015, IEEE Transactions on Information Theory.
[14] Michael I. Jordan,et al. Gradient Descent Converges to Minimizers , 2016, ArXiv.
[15] Georgios Piliouras,et al. Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions , 2016, ITCS.
[16] Gene H. Golub,et al. Matrix computations , 1983 .
[17] Prateek Jain,et al. Fast Exact Matrix Completion with Finite Samples , 2014, COLT.
[18] John E. Moody,et al. Towards Faster Stochastic Gradient Search , 1991, NIPS.
[19] Hossein Mobahi,et al. Training Recurrent Neural Networks by Diffusion , 2016, ArXiv.
[20] Georgios Piliouras,et al. Gradient Descent Converges to Minimizers: The Case of Non-Isolated Critical Points , 2016, ArXiv.
[21] S. Shreve,et al. Stochastic differential equations , 1955, Mathematical Proceedings of the Cambridge Philosophical Society.
[22] V. Climenhaga. Markov chains and mixing times , 2013 .
[23] Yonina C. Eldar,et al. Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference , 2015, ArXiv.
[24] John Wright,et al. A Geometric Analysis of Phase Retrieval , 2016, 2016 IEEE International Symposium on Information Theory (ISIT).
[25] Sanjeev Arora,et al. Simple, Efficient, and Neural Algorithms for Sparse Coding , 2015, COLT.
[26] Prateek Jain,et al. Low-rank matrix completion using alternating minimization , 2012, STOC '13.
[27] Stephen P. Boyd,et al. A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights , 2014, J. Mach. Learn. Res.
[28] Zhaoran Wang,et al. Optimal Computational and Statistical Rates of Convergence for Sparse Nonconvex Learning Problems , 2013, Annals of Statistics.
[29] Martin J. Wainwright,et al. Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees , 2015, ArXiv.
[30] K. A. Semendyayev,et al. Handbook of mathematics , 1985 .
[31] John Wright,et al. Complete Dictionary Recovery Over the Sphere I: Overview and the Geometric Picture , 2015, IEEE Transactions on Information Theory.
[32] Furong Huang,et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition , 2015, COLT.
[33] Guang Cheng,et al. Non-convex Statistical Optimization for Sparse Tensor Graphical Model , 2015, NIPS.
[34] Xiaodong Li,et al. Optimal Rates of Convergence for Noisy Sparse Phase Retrieval via Thresholded Wirtinger Flow , 2015, ArXiv.
[35] Kean Ming Tan,et al. Sparse generalized eigenvalue problem: optimal statistical rates via truncated Rayleigh flow , 2016, Journal of the Royal Statistical Society: Series B (Statistical Methodology).
[36] Po-Ling Loh,et al. Regularized M-estimators with nonconvexity: statistical and algorithmic theory for local optima , 2013, J. Mach. Learn. Res.
[37] E Weinan,et al. Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms , 2015, ICML.
[38] Zhaoran Wang,et al. High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality , 2015, NIPS.
[39] B. Øksendal. Stochastic Differential Equations , 1985 .
[40] Tong Zhang,et al. Near-optimal stochastic approximation for online principal component estimation , 2016, Math. Program.
[41] Anima Anandkumar,et al. Efficient approaches for escaping higher order saddle points in non-convex optimization , 2016, COLT.
[42] Prateek Jain,et al. Phase Retrieval Using Alternating Minimization , 2013, IEEE Transactions on Signal Processing.
[43] M. Hirsch,et al. Differential Equations, Dynamical Systems, and an Introduction to Chaos , 2003 .
[44] Prateek Jain,et al. Computing Matrix Squareroot via Non Convex Local Search , 2015, ArXiv.
[45] Yuxin Chen,et al. Solving Random Quadratic Systems of Equations Is Nearly as Easy as Solving Linear Systems , 2015, NIPS.
[46] Max Simchowitz,et al. Low-rank Solutions of Linear Matrix Equations via Procrustes Flow , 2015, ICML.
[47] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[48] Anima Anandkumar,et al. Analyzing Tensor Power Method Dynamics in Overcomplete Regime , 2014, J. Mach. Learn. Res.
[49] S. Ethier,et al. Markov Processes: Characterization and Convergence , 2005 .
[50] R. Durrett. Probability: Theory and Examples , 1993 .
[51] D. W. Stroock,et al. Multidimensional Diffusion Processes , 1979 .
[52] Xi Chen,et al. Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing , 2014, J. Mach. Learn. Res.
[53] Sujay Sanghavi,et al. The Local Convexity of Solving Systems of Quadratic Equations , 2015, arXiv:1506.07868.
[54] Zhaoran Wang,et al. A Nonconvex Optimization Framework for Low Rank Matrix Estimation , 2015, NIPS.
[55] Xiaodong Li,et al. Phase Retrieval via Wirtinger Flow: Theory and Algorithms , 2014, IEEE Transactions on Information Theory.
[56] Anastasios Kyrillidis,et al. Dropping Convexity for Faster Semi-definite Optimization , 2015, COLT.
[57] John Wright,et al. Finding a Sparse Vector in a Subspace: Linear Sparsity Using Alternating Directions , 2014, IEEE Transactions on Information Theory.
[58] Martin J. Wainwright,et al. Statistical guarantees for the EM algorithm: From population to sample-based analysis , 2014, ArXiv.
[59] Prateek Jain,et al. Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization , 2013, SIAM J. Optim.
[60] Prateek Jain,et al. Non-convex Robust PCA , 2014, NIPS.
[61] Han Liu,et al. Provable sparse tensor decomposition , 2015, arXiv:1502.01425.