[1] Kenneth Levenberg. A Method for the Solution of Certain Non-Linear Problems in Least Squares , 1944 .
[2] D. Marquardt. An Algorithm for Least-Squares Estimation of Nonlinear Parameters , 1963 .
[3] J. Nocedal. Updating Quasi-Newton Matrices With Limited Storage , 1980 .
[4] P. McCullagh,et al. Generalized Linear Models , 1984 .
[5] Jorge Nocedal,et al. On the limited memory BFGS method for large scale optimization , 1989, Math. Program..
[6] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain , 1994 .
[7] Dimitri P. Bertsekas,et al. Nonlinear Programming , 1997 .
[8] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[9] Christopher J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition , 1998 .
[10] Léon Bottou. Online Learning and Stochastic Approximations , 1998 .
[11] Léon Bottou,et al. On-line learning and stochastic approximations , 1999 .
[12] Stephen J. Wright,et al. Numerical Optimization , 2018, Fundamental Statistical Inference.
[13] E. Haber,et al. On optimization techniques for solving nonlinear inverse problems , 2000 .
[14] Alexander J. Smola,et al. Learning with Kernels: support vector machines, regularization, optimization, and beyond , 2001, Adaptive computation and machine learning series.
[15] Olvi L. Mangasarian,et al. A finite Newton method for classification , 2002, Optim. Methods Softw..
[16] Yann LeCun,et al. Large Scale Online Learning , 2003, NIPS.
[17] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[18] Christopher J. C. Burges,et al. A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.
[19] S. Sathiya Keerthi,et al. A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs , 2005, J. Mach. Learn. Res..
[20] Rong Jin,et al. Distance Metric Learning: A Comprehensive Survey , 2006 .
[21] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[22] Petros Drineas,et al. Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication , 2006, SIAM J. Comput..
[23] H. Robbins. A Stochastic Approximation Method , 1951 .
[24] Olivier Chapelle,et al. Training a Support Vector Machine in the Primal , 2007, Neural Computation.
[25] Chih-Jen Lin,et al. Trust Region Newton Method for Logistic Regression , 2008, J. Mach. Learn. Res..
[26] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[27] D. Bernstein. Matrix Mathematics: Theory, Facts, and Formulas , 2009 .
[28] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[29] S. V. N. Vishwanathan,et al. A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning , 2008, J. Mach. Learn. Res..
[30] Vincent Nesme,et al. Note on sampling without replacing from a finite collection of matrices , 2010, ArXiv.
[31] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[32] U. Ascher,et al. Adaptive and stochastic algorithms for EIT and DC resistivity problems with piecewise constant solutions and many measurements , 2011 .
[33] Jorge Nocedal,et al. On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning , 2011, SIAM J. Optim..
[34] Michael W. Mahoney. Randomized Algorithms for Matrices and Data , 2011, Found. Trends Mach. Learn..
[35] Jacek Gondzio,et al. Exploiting separability in large-scale linear support vector machine training , 2011, Comput. Optim. Appl..
[36] Nathan Halko,et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions , 2009, SIAM Rev..
[37] Ohad Shamir,et al. Better Mini-Batch Algorithms via Accelerated Gradient Methods , 2011, NIPS.
[38] Mark W. Schmidt,et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets , 2012, NIPS.
[39] Eldad Haber,et al. An Effective Method for Parameter Estimation with PDE Constraints with Multiple Right-Hand Sides , 2012, SIAM J. Optim..
[40] Felix J. Herrmann,et al. Robust inversion, dimensionality reduction, and randomized sampling , 2012, Math. Program..
[41] Mark W. Schmidt,et al. Hybrid Deterministic-Stochastic Methods for Data Fitting , 2011, SIAM J. Sci. Comput..
[42] Jorge Nocedal,et al. Sample size selection in optimization methods for machine learning , 2012, Math. Program..
[43] Shai Shalev-Shwartz,et al. Stochastic dual coordinate ascent methods for regularized loss , 2012, J. Mach. Learn. Res..
[44] Georg Heigold,et al. An empirical study of learning rates in deep neural networks for speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[45] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[46] Martin Jaggi,et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization , 2013, ICML.
[47] E. Haber,et al. The lost honor of ℓ2-based regularization , 2013 .
[48] Uri M. Ascher,et al. Data completion and stochastic algorithms for PDE inversion problems with many measurements , 2013, ArXiv.
[49] Michael I. Jordan,et al. Matrix concentration inequalities via the method of exchangeable pairs , 2012, 1201.6002.
[50] M. Girolami,et al. Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo , 2014, 1407.1517.
[51] Uri M. Ascher,et al. Stochastic Algorithms for Inverse Problems Involving PDEs and many Measurements , 2014, SIAM J. Sci. Comput..
[52] Eldad Haber,et al. Simultaneous Source for non-uniform data variance and missing data , 2014, ArXiv.
[53] Alexander J. Smola,et al. Efficient mini-batch training for stochastic optimization , 2014, KDD.
[54] Uri M. Ascher,et al. Assessing stochastic algorithms for large scale nonlinear least squares problems using extremal probabilities of linear combinations of gamma random variables , 2014, SIAM/ASA J. Uncertain. Quantification.
[55] Joel A. Tropp,et al. An Introduction to Matrix Concentration Inequalities , 2015, Found. Trends Mach. Learn..
[56] Andrea Montanari,et al. Convergence rates of sub-sampled Newton methods , 2015, NIPS.
[57] Ohad Shamir,et al. Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity , 2015, ICML.
[58] Michael W. Mahoney,et al. Sub-Sampled Newton Methods I: Globally Convergent Algorithms , 2016, ArXiv.
[59] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[60] Martin J. Wainwright,et al. Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence , 2015, SIAM J. Optim..