Carlo Fischione | Mikael Skoglund | Hossein Shokri Ghadikolaei | Hadi G. Ghauch
[1] Zhi-Quan Luo, et al. Parallel Successive Convex Approximation for Nonsmooth Nonconvex Optimization, 2014, NIPS.
[2] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[3] Zheng Xu, et al. Training Neural Networks Without Gradients: A Scalable ADMM Approach, 2016, ICML.
[4] Inderjit S. Dhillon, et al. Recovery Guarantees for One-hidden-layer Neural Networks, 2017, ICML.
[5] Ting-Kam Leonard Wong, et al. Exponentially concave functions and a new information geometry, 2016, ArXiv.
[6] Nishant Mehta, et al. Fast rates with high probability in exp-concave statistical learning, 2016, AISTATS.
[7] Julien Mairal, et al. Optimization with First-Order Surrogate Functions, 2013, ICML.
[8] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[10] P. Tseng. Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization, 2001.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[12] Ziming Zhang, et al. Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks, 2017, NIPS.
[13] A. Montanari, et al. The landscape of empirical risk for nonconvex losses, 2016, The Annals of Statistics.
[14] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[15] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[16] Martin J. Wainwright, et al. On the Learnability of Fully-Connected Neural Networks, 2017, AISTATS.
[17] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[18] Simone Scardapane, et al. Parallel and distributed training of neural networks via successive convex approximation, 2016, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).
[19] Francisco Facchinei, et al. Parallel Algorithms for Big Data Optimization, 2014, ArXiv.
[20] G. Lewicki, et al. Approximation by Superpositions of a Sigmoidal Function, 2003.
[21] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.
[22] Rong Jin, et al. Excess Risk Bounds for Exponentially Concave Losses, 2014, ArXiv.
[23] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[24] Hao Yu, et al. Levenberg-Marquardt Training, 2011.
[25] Stephen P. Boyd, et al. Proximal Algorithms, 2013, Found. Trends Optim.
[26] Simon Haykin, et al. Neural Networks: A Comprehensive Foundation, 1998.
[27] Michael Möller, et al. Proximal Backpropagation, 2017, ICLR.
[28] Suvrit Sra, et al. Global optimality conditions for deep neural networks, 2017, ICLR.
[29] Olvi L. Mangasarian, et al. Backpropagation Convergence via Deterministic Nonmonotone Perturbed Minimization, 1993, NIPS.
[30] Haihao Lu, et al. Depth Creates No Bad Local Minima, 2017, ArXiv.
[31] Yuandong Tian, et al. An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis, 2017, ICML.
[32] Zhi-Quan Luo, et al. A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization, 2012, SIAM J. Optim.
[33] Ronald L. Rivest, et al. Training a 3-node neural network is NP-complete, 1988, COLT '88.
[34] Maria Gabriela Eberle, et al. Finding the closest Toeplitz matrix, 2003.
[35] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[36] Shai Shalev-Shwartz, et al. Beyond Convexity: Stochastic Quasi-Convex Optimization, 2015, NIPS.
[37] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.