GS-OPT: A new fast stochastic algorithm for solving non-convex optimization problems

Non-convex optimization plays an important role in machine learning, yet its theoretical understanding remains rather limited. Designing efficient algorithms for non-convex optimization has attracted a great deal of attention, but such problems are usually NP-hard to solve. In this paper, we propose a new algorithm, GS-OPT (General Stochastic OPTimization), which is effective for solving non-convex problems. Our idea is to combine two stochastic bounds of the objective function, constructed using a common discrete probability distribution, the Bernoulli distribution. We study GS-OPT carefully from both the theoretical and the experimental perspectives. We also apply GS-OPT to the posterior inference problem in latent Dirichlet allocation. Empirical results show that our approach is often more efficient than previous ones.
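The abstract only sketches the construction, so the following is a minimal illustrative sketch of the stated idea: at each iteration a Bernoulli draw selects which of two stochastic bound surrogates of the objective drives the gradient step. The surrogate gradients (grad_lower, grad_upper), the mixing probability p, and the diminishing step-size schedule are all hypothetical placeholders for illustration, not the authors' actual GS-OPT procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_lower(x):
    # Hypothetical stochastic gradient of a lower-bound surrogate
    # of the objective (placeholder, not from the paper).
    return 2.0 * x + rng.normal(scale=0.1, size=x.shape)

def grad_upper(x):
    # Hypothetical stochastic gradient of an upper-bound surrogate
    # (placeholder, not from the paper).
    return 2.0 * x + 0.5 * np.sin(x) + rng.normal(scale=0.1, size=x.shape)

def gs_opt_sketch(x0, p=0.5, step=0.05, iters=200):
    """Bernoulli-mixed stochastic surrogate descent (illustrative only)."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, iters + 1):
        b = rng.binomial(1, p)                   # Bernoulli draw selects a bound
        g = grad_upper(x) if b else grad_lower(x)
        x = x - (step / np.sqrt(t)) * g          # diminishing step size
    return x

print(gs_opt_sketch(np.array([3.0, -2.0])))
```

Under these placeholder surrogates the iterates contract toward the minimizer of the quadratic term; the sketch is only meant to convey how a Bernoulli variable can randomize between two bound-based surrogates within a standard stochastic descent loop.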
