[1] Quynh Nguyen, et al. On Connected Sublevel Sets in Deep Learning, 2019, ICML.
[2] Chee Kheong Siew, et al. Extreme learning machine: Theory and applications, 2006, Neurocomputing.
[3] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[4] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[5] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[6] Harald Haas, et al. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication, 2004, Science.
[7] J. Ross Quinlan, et al. Combining Instance-Based and Model-Based Learning, 1993, ICML.
[8] Andrew Gordon Wilson, et al. Averaging Weights Leads to Wider Optima and Better Generalization, 2018, UAI.
[9] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[10] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[11] Chris Eliasmith, et al. Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems, 2004, IEEE Transactions on Neural Networks.
[12] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[13] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, ICCV.
[14] Andrew Gordon Wilson, et al. Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs, 2018, NeurIPS.
[15] E. Allgower, et al. Numerical path following, 1997.
[16] Hans J. Skaug, et al. Efficient Computation of Hessian Matrices in TensorFlow, 2019, ArXiv.
[17] H. Jónsson, et al. Nudged elastic band method for finding minimum energy paths of transitions, 1998.
[18] Joan Bruna, et al. Spurious Valleys in One-hidden-layer Neural Network Optimization Landscapes, 2019, J. Mach. Learn. Res.
[19] Fred A. Hamprecht, et al. Essentially No Barriers in Neural Network Energy Landscape, 2018, ICML.
[20] J. B. Rosen. The Gradient Projection Method for Nonlinear Programming. Part I. Linear Constraints, 1960.
[21] R. Fisher. The Use of Multiple Measurements in Taxonomic Problems, 1936.
[22] Yann Dauphin, et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks, 2017, ICLR.
[23] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[24] Pierre Baldi, et al. The capacity of feedforward neural networks, 2019, Neural Networks.
[25] Daniel Soudry, et al. No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016, ArXiv.
[26] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[27] Ashish Agarwal, et al. TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning, 2019, SysML.
[28] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[29] Surya Ganguli, et al. Exponential expressivity in deep neural networks through transient chaos, 2016, NIPS.
[30] Karl Pearson. LIII. On Lines and Planes of Closest Fit to Systems of Points in Space, 1901.