暂无分享,去创建一个
[1] Dmitry Yarotsky,et al. Error bounds for approximations with deep ReLU networks , 2016, Neural Networks.
[2] Harold R. Parks,et al. A Primer of Real Analytic Functions , 1992 .
[3] Yann LeCun,et al. The Loss Surfaces of Multilayer Networks , 2014, AISTATS.
[4] Matus Telgarsky,et al. Representation Benefits of Deep Feedforward Networks , 2015, ArXiv.
[5] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.
[6] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Kenji Kawaguchi,et al. Deep Learning without Poor Local Minima , 2016, NIPS.
[8] Mahdi Soltanolkotabi,et al. Learning ReLUs via Gradient Descent , 2017, NIPS.
[9] Yoshua Bengio,et al. Understanding intermediate layers using linear classifier probes , 2016, ICLR.
[10] Yuandong Tian,et al. When is a Convolutional Filter Easy To Learn? , 2017, ICLR.
[11] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[12] Elad Hoffer,et al. Exponentially vanishing sub-optimal local minima in multilayer neural networks , 2017, ICLR.
[13] Tengyu Ma,et al. Identity Matters in Deep Learning , 2016, ICLR.
[14] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.
[15] Hod Lipson,et al. Understanding Neural Networks Through Deep Visualization , 2015, ArXiv.
[16] Yoshua Bengio,et al. Shallow vs. Deep Sum-Product Networks , 2011, NIPS.
[17] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[18] Matus Telgarsky,et al. Benefits of Depth in Neural Networks , 2016, COLT.
[19] Razvan Pascanu,et al. On the number of response regions of deep feed forward networks with piece-wise linear activations , 2013, 1312.6098.
[20] Ohad Shamir,et al. Distribution-Specific Hardness of Learning Neural Networks , 2016, J. Mach. Learn. Res..
[21] Oriol Vinyals,et al. Qualitatively characterizing neural network optimization problems , 2014, ICLR.
[22] Amnon Shashua,et al. Convolutional Rectifier Networks as Generalized Tensor Decompositions , 2016, ICML.
[23] Yuanzhi Li,et al. Convergence Analysis of Two-layer Neural Networks with ReLU Activation , 2017, NIPS.
[24] T. Poggio,et al. Deep vs. shallow networks : An approximation theory perspective , 2016, ArXiv.
[25] Andrea Vedaldi,et al. Understanding deep image representations by inverting them , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[27] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Surya Ganguli,et al. On the Expressive Power of Deep Neural Networks , 2016, ICML.
[29] B. Mityagin. The Zero Set of a Real Analytic Function , 2015, Mathematical Notes.
[30] Anima Anandkumar,et al. Provable Methods for Training Neural Networks with Sparse Connectivity , 2014, ICLR.
[31] Joan Bruna,et al. Topology and Geometry of Half-Rectified Network Optimization , 2016, ICLR.
[32] Ronald L. Rivest,et al. Training a 3-node neural network is NP-complete , 1988, COLT '88.
[33] Ohad Shamir,et al. Failures of Gradient-Based Deep Learning , 2017, ICML.
[34] Lorenzo Rosasco,et al. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review , 2016, International Journal of Automation and Computing.
[35] Inderjit S. Dhillon,et al. Recovery Guarantees for One-hidden-layer Neural Networks , 2017, ICML.
[36] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[38] Adam R. Klivans,et al. Learning Depth-Three Neural Networks in Polynomial Time , 2017, ArXiv.
[39] Ohad Shamir,et al. The Power of Depth for Feedforward Neural Networks , 2015, COLT.
[40] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[41] Matthias Hein,et al. The Loss Surface of Deep and Wide Neural Networks , 2017, ICML.
[42] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[43] Yuandong Tian,et al. An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis , 2017, ICML.
[44] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[45] Yann LeCun,et al. Open Problem: The landscape of the loss surfaces of multilayer networks , 2015, COLT.
[46] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Surya Ganguli,et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization , 2014, NIPS.
[48] Amir Globerson,et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs , 2017, ICML.
[49] Zhenghao Chen,et al. On Random Weights and Unsupervised Feature Learning , 2011, ICML.
[50] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[51] Ohad Shamir,et al. Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks , 2016, ICML.
[52] Anima Anandkumar,et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods , 2017 .
[53] Suvrit Sra,et al. Global optimality conditions for deep neural networks , 2017, ICLR.
[54] Mohammed Bennamoun,et al. How Can Deep Rectifier Networks Achieve Linear Separability and Preserve Distances? , 2015, ICML.
[55] Eugenio Culurciello,et al. ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation , 2016, ArXiv.
[56] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[57] Max Jaderberg,et al. Understanding Synthetic Gradients and Decoupled Neural Interfaces , 2017, ICML.
[58] Razvan Pascanu,et al. On the Number of Linear Regions of Deep Neural Networks , 2014, NIPS.
[59] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[60] Peter Auer,et al. Exponentially many local minima for single neurons , 1995, NIPS.
[61] Quynh N. Nguyen,et al. Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods , 2016, NIPS.
[62] R. Srikant,et al. Why Deep Neural Networks for Function Approximation? , 2016, ICLR.
[63] N. V. Dang. Complex powers of analytic functions and meromorphic renormalization in QFT , 2015, 1503.00995.
[64] René Vidal,et al. Global Optimality in Tensor Factorization, Deep Learning, and Beyond , 2015, ArXiv.
[65] Alexandr Andoni,et al. Learning Polynomials with Neural Networks , 2014, ICML.
[66] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[67] Jirí Síma,et al. Training a Single Sigmoidal Neuron Is Hard , 2002, Neural Comput..