[1] Surya Ganguli, et al. On the saddle point problem for non-convex optimization, 2014, ArXiv.
[2] Ching-Piao Tsai, et al. Back-propagation neural network in tidal-level forecasting, 2001.
[3] Jie Liu, et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient, 2017, ICML.
[4] Alexander Shapiro, et al. Stochastic Approximation approach to Stochastic Programming, 2013.
[5] Philippe L. Toint, et al. Towards an efficient sparsity exploiting Newton method for minimization, 1981.
[6] Fei-Fei Li, et al. Large-Scale Video Classification with Convolutional Neural Networks, 2014, IEEE Conference on Computer Vision and Pattern Recognition.
[7] M. C. Deo, et al. Neural networks for wave forecasting, 2001.
[8] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[9] Jürgen Schmidhuber, et al. Multi-column deep neural networks for image classification, 2012, IEEE Conference on Computer Vision and Pattern Recognition.
[10] Haiyan Lu, et al. Multi-step forecasting for wind speed using a modified EMD-based artificial neural network model, 2012.
[11] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[12] Michael W. Mahoney, et al. Sub-Sampled Newton Methods I: Globally Convergent Algorithms, 2016, ArXiv.
[13] Katya Scheinberg, et al. Convergence of Trust-Region Methods Based on Probabilistic Models, 2013, SIAM J. Optim.
[14] James C. Spall, et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, 2003, Wiley-Interscience Series in Discrete Mathematics and Optimization.
[15] Katya Scheinberg, et al. Stochastic optimization using a trust-region method and random models, 2015, Mathematical Programming.
[16] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.
[17] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.
[18] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, IEEE International Conference on Computer Vision (ICCV).
[19] Saeed Ghadimi, et al. Accelerated gradient methods for nonconvex nonlinear and stochastic programming, 2013, Mathematical Programming.
[20] G. C. Lee, et al. Neural networks trained by analytically simulated damage states, 1993.
[21] Lukáš Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[22] Peng Xu, et al. Sub-sampled Newton Methods with Non-uniform Sampling, 2016, NIPS.
[23] G. G. Stokes. "J.", 1890, The New Yale Book of Quotations.
[24] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[25] Guanghui Lan, et al. An optimal method for stochastic composite optimization, 2011, Mathematical Programming.
[26] Jorge Nocedal, et al. A Stochastic Quasi-Newton Method for Large-Scale Optimization, 2014, SIAM J. Optim.
[27] J. Blanchet, et al. Convergence Rate Analysis of a Stochastic Trust Region Method for Nonconvex Optimization, 2016.
[28] Raghu Pasupathy, et al. Simulation Optimization: A Concise Overview and Implementation Guide, 2013.
[29] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[30] Hojjat Adeli, et al. Neural Networks in Civil Engineering: 1989–2000, 2001.
[31] Stuart E. Dreyfus, et al. Second-order stagewise backpropagation for Hessian-matrix analyses and investigation of negative curvature, 2008, Neural Networks.
[32] Nicholas I. M. Gould, et al. Solving the Trust-Region Subproblem using the Lanczos Method, 1999, SIAM J. Optim.
[33] Ilya Sutskever, et al. Training Deep and Recurrent Networks with Hessian-Free Optimization, 2012, Neural Networks: Tricks of the Trade.
[34] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[35] Tong Zhang, et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, 2013, Mathematical Programming.
[36] Ming Yang, et al. 3D Convolutional Neural Networks for Human Action Recognition, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[38] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.
[39] Simon Haykin, et al. Neural Networks: A Comprehensive Foundation, 1998.
[40] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization: A Basic Course, 2014, Applied Optimization.
[41] Tara N. Sainath, et al. Improving deep neural networks for LVCSR using rectified linear units and dropout, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[43] Stephen J. Wright. Optimization algorithms for data analysis, 2018, IAS/Park City Mathematics Series.
[44] Dong Yu, et al. Deep Learning and Its Applications to Signal and Information Processing, 2011.
[45] Dong Yu, et al. Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP], 2011, IEEE Signal Processing Magazine.
[46] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[47] L. N. Vicente, et al. Complexity and global rates of trust-region methods based on probabilistic models, 2018.
[48] Mark W. Schmidt, et al. Minimizing finite sums with the stochastic average gradient, 2013, Mathematical Programming.
[49] James Martens. Deep learning via Hessian-free optimization, 2010, ICML.
[50] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[51] James H. Garrett, et al. Artificial Neural Networks for Civil Engineers: Fundamentals and Applications, 1997.
[52] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[53] S. Ashhab, et al. Fully connected network of superconducting qubits in a cavity, 2008, ArXiv:0802.1469.
[54] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[55] G. Gnecco, et al. Approximation Error Bounds via Rademacher's Complexity, 2008.
[56] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.
[57] Patrick Gallinari, et al. SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent, 2009, J. Mach. Learn. Res.
[58] Seyedalireza Yektamaram, et al. Optimization Algorithms for Machine Learning Designed for Parallel and Distributed Environments, 2018.
[59] Jorge Nocedal, et al. A Multi-Batch L-BFGS Method for Machine Learning, 2016, NIPS.
[60] Andrea Montanari, et al. Convergence rates of sub-sampled Newton methods, 2015, NIPS.
[61] Yoshua Bengio, et al. Convolutional networks for images, speech, and time series, 1998.
[62] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[63] J. A. Anderson, et al. Neurocomputing: Foundations of Research; Neurocomputing 2: Directions for Research, 1992.
[64] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[65] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[66] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[67] Alexandre d'Aspremont, et al. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data, 2008, J. Mach. Learn. Res.
[68] Jorge Nocedal, et al. Representations of quasi-Newton matrices and their use in limited memory methods, 1994, Math. Program.
[69] C. P. Sheppard, et al. Predicting time series by a fully connected neural network trained by back propagation, 1992.
[70] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[71] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[72] Simon Günter, et al. A Stochastic Quasi-Newton Method for Online Convex Optimization, 2007, AISTATS.
[73] Michael W. Mahoney, et al. Sub-Sampled Newton Methods II: Local Convergence Rates, 2016, ArXiv.
[74] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[75] Robert M. Gower, et al. Stochastic Block BFGS: Squeezing More Curvature out of Data, 2016, ICML.
[76] Katya Scheinberg, et al. Global convergence rate analysis of unconstrained optimization methods based on probabilistic models, 2015, Mathematical Programming.
[77] Jürgen Schmidhuber, et al. Deep learning in neural networks: An overview, 2014, Neural Networks.
[78] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[79] James C. Spall, et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, 2007.
[80] J. Nocedal, et al. Exact and Inexact Subsampled Newton Methods for Optimization, 2016, ArXiv:1609.08502.
[81] Stephen J. Wright, et al. Numerical Optimization (Springer Series in Operations Research and Financial Engineering), 2000.
[82] T. Steihaug. The Conjugate Gradient Method and Trust Regions in Large Scale Optimization, 1983.
[83] Marc Teboulle, et al. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, 2009, SIAM J. Imaging Sci.
[84] H. Robbins. A Stochastic Approximation Method, 1951.
[85] Martin T. Hagan, et al. Neural Network Design, 1995.
[86] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[87] Frank E. Curtis, et al. A Self-Correcting Variable-Metric Algorithm for Stochastic Optimization, 2016, ICML.
[88] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[89] Ian Flood, et al. Neural Networks in Civil Engineering. I: Principles and Understanding, 1994.
[90] Mark W. Schmidt, et al. Hybrid Deterministic-Stochastic Methods for Data Fitting, 2011, SIAM J. Sci. Comput.
[91] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[92] K. Thirumalaiah, et al. River Stage Forecasting Using Artificial Neural Networks, 1998.
[93] Herbert Jaeger, et al. Reservoir computing approaches to recurrent neural network training, 2009, Comput. Sci. Rev.
[94] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.