A Multi-Batch L-BFGS Method for Machine Learning
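As background for the method named in the title: multi-batch variants build on the standard L-BFGS two-loop recursion described in Nocedal and Wright [9]. The sketch below is a minimal NumPy illustration of that classical recursion, not the paper's multi-batch algorithm; the function name and structure are illustrative only.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Classical L-BFGS two-loop recursion (Nocedal & Wright [9]).

    Computes the search direction -H_k * grad from the m most recent
    curvature pairs, where s_i = x_{i+1} - x_i and y_i = g_{i+1} - g_i.
    Illustrative sketch only; not the paper's multi-batch scheme.
    """
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: walk the curvature pairs from newest to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q -= alpha * y
    # Initial Hessian scaling gamma_k = s'y / y'y from the newest pair.
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: walk the pairs from oldest to newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r  # descent direction: -H_k applied to the gradient
```

With no stored pairs the recursion reduces to steepest descent (`-grad`); each stored pair with positive curvature `s'y > 0` keeps the implicit Hessian approximation positive definite, so the returned direction remains a descent direction.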
[1] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[2] Weizhu Chen,et al. Large-scale L-BFGS using MapReduce , 2014, NIPS.
[3] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[4] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[5] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[6] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.
[7] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[8] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[9] Stephen J. Wright,et al. Numerical Optimization , 2018, Fundamental Statistical Inference.
[10] Yu-Hong Dai,et al. Convergence Properties of the BFGS Algorithm , 2002, SIAM J. Optim..
[11] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.
[12] John Langford,et al. A reliable effective terascale linear learning system , 2011, J. Mach. Learn. Res..
[13] Avleen Singh Bijral,et al. Mini-Batch Primal and Dual Methods for SVMs , 2013, ICML.
[14] Dimitris S. Papailiopoulos,et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization , 2015, SIAM J. Optim..
[15] J. Nocedal,et al. Exact and Inexact Subsampled Newton Methods for Optimization , 2016, 1609.08502.
[16] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[17] H. Robbins. A Stochastic Approximation Method , 1951 .
[18] Yuchen Zhang,et al. DiSCO: Distributed Optimization for Self-Concordant Empirical Loss , 2015, ICML.
[19] D. Bertsekas,et al. Convergence Rate of Incremental Subgradient Algorithms , 2000 .
[20] Walter F. Mascarenhas,et al. The BFGS method with exact line searches fails for non-convex objective functions , 2004, Math. Program..
[21] Simon Günter,et al. A Stochastic Quasi-Newton Method for Online Convex Optimization , 2007, AISTATS.
[22] Yuchen Zhang,et al. Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss , 2015, ArXiv.
[23] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[24] Masao Fukushima,et al. On the Global Convergence of the BFGS Method for Nonconvex Unconstrained Optimization Problems , 2000, SIAM J. Optim..
[25] Jorge Nocedal,et al. A Stochastic Quasi-Newton Method for Large-Scale Optimization , 2014, SIAM J. Optim..
[26] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[27] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[28] Jorge Nocedal,et al. On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning , 2011, SIAM J. Optim..