Probabilistic Line Searches for Stochastic Optimization