A Distributed Second-Order Algorithm You Can Trust
Thomas Hofmann | Aurélien Lucchi | Martin Jaggi | Celestine Dünner | Matilde Gargiani | An Bian
[1] Nicholas I. M. Gould, et al. Trust Region Methods, 2000, MOS-SIAM Series on Optimization.
[2] Yurii Nesterov, et al. Cubic regularization of Newton method and its global performance, 2006, Math. Program.
[3] Jianfeng Gao, et al. Scalable training of L1-regularized log-linear models, 2007, ICML '07.
[4] Francis R. Bach, et al. Self-concordant analysis for logistic regression, 2009, ArXiv.
[5] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[6] Nicholas I. M. Gould, et al. Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results, 2011, Math. Program.
[7] Nicholas I. M. Gould, et al. Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity, 2011, Math. Program.
[8] Shai Shalev-Shwartz, et al. Stochastic dual coordinate ascent methods for regularized loss, 2012, J. Mach. Learn. Res.
[9] Tianbao Yang, et al. Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent, 2013, NIPS.
[10] Thomas Hofmann, et al. Communication-Efficient Distributed Dual Coordinate Ascent, 2014, NIPS.
[11] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[12] Tianbao Yang, et al. Distributed Stochastic Variance Reduced Gradient Methods and a Lower Bound for Communication Complexity, 2015.
[13] Yuchen Zhang, et al. DiSCO: Distributed Optimization for Self-Concordant Empirical Loss, 2015, ICML.
[14] Martin Jaggi, et al. Primal-Dual Rates and Certificates, 2016, ICML.
[15] Peter Richtárik, et al. Distributed Coordinate Descent Method for Learning with Big Data, 2013, J. Mach. Learn. Res.
[16] Inderjit S. Dhillon, et al. Communication-Efficient Parallel Block Minimization for Kernel Machines, 2016, ArXiv.
[17] Alexander J. Smola, et al. AIDE: Fast and Communication Efficient Distributed Optimization, 2016, ArXiv.
[18] Tong Zhang, et al. A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization, 2016, J. Mach. Learn. Res.
[19] Michael I. Jordan, et al. CoCoA: A General Framework for Communication-Efficient Distributed Optimization, 2016, J. Mach. Learn. Res.
[20] Matilde Gargiani. Hessian-CoCoA: a general parallel and distributed framework for non-strongly convex regularizers, 2017.
[21] Ilya Trofimov, et al. Distributed coordinate descent for generalized linear models with regularization, 2017, Pattern Recognition and Image Analysis.
[22] S. Sundararajan, et al. A distributed block coordinate descent method for training l1-regularized linear classifiers, 2014, J. Mach. Learn. Res.
[23] Shusen Wang, et al. GIANT: Globally Improved Approximate Newton Method for Distributed Optimization, 2017, NeurIPS.
[24] Stephen J. Wright, et al. A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization, 2018, KDD.
[25] Martin Jaggi, et al. Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients, 2018, ArXiv.
[26] Stephen J. Wright, et al. Inexact successive quadratic approximation for regularized optimization, 2018, Comput. Optim. Appl.
[27] Kai-Wei Chang, et al. Distributed block-diagonal approximation methods for regularized empirical risk minimization, 2017, Machine Learning.