Chiyuan Zhang | Samy Bengio | Moritz Hardt | Benjamin Recht | Oriol Vinyals
[1] Hrushikesh Narhar Mhaskar, et al. Approximation properties of a multilayered feedforward artificial neural network, 1993, Adv. Comput. Math.
[2] Peter L. Bartlett, et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network, 1998, IEEE Trans. Inf. Theory.
[3] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[4] Bernhard Schölkopf, et al. A Generalized Representer Theorem, 2001, COLT/EuroCOLT.
[5] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[6] T. Poggio, et al. Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization, 2002.
[7] T. Poggio, et al. General conditions for predictivity in learning theory, 2004, Nature.
[8] Y. Yao, et al. On Early Stopping in Gradient Descent Learning, 2007.
[9] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[10] Ohad Shamir, et al. Learnability, Stability and Uniform Convergence, 2010, J. Mach. Learn. Res.
[11] Yoshua Bengio, et al. Shallow vs. Deep Sum-Product Networks, 2011, NIPS.
[12] Andrew Y. Ng, et al. Learning Feature Representations with K-Means, 2012, Neural Networks: Tricks of the Trade.
[13] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[14] J. T. Spooner, et al. Adaptive and Learning Systems for Signal Processing, Communications, and Control, 2006.
[15] D. Costarelli, et al. Constructive Approximation by Superposition of Sigmoidal Functions, 2013.
[16] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[17] Roi Livni, et al. On the Computational Efficiency of Training Neural Networks, 2014, NIPS.
[18] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[19] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[20] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[21] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[22] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[23] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[24] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Matus Telgarsky, et al. Benefits of Depth in Neural Networks, 2016, COLT.
[27] Lorenzo Rosasco, et al. Generalization Properties and Implicit Regularization for Multiple Passes SGM, 2016, ICML.
[28] Amnon Shashua, et al. Convolutional Rectifier Networks as Generalized Tensor Decompositions, 2016, ICML.
[29] T. Poggio, et al. Deep vs. shallow networks: An approximation theory perspective, 2016, arXiv.
[30] Ohad Shamir, et al. The Power of Depth for Feedforward Neural Networks, 2015, COLT.
[31] Omer Levy, et al. Simulating Action Dynamics with Neural Process Networks, 2018, ICLR.