Leslie Pack Kaelbling | Yoshua Bengio | Kenji Kawaguchi
[1] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[2] Allan Pinkus, et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function, 1991, Neural Networks.
[3] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[4] John Shawe-Taylor, et al. Structural Risk Minimization Over Data-Dependent Hierarchies, 1998, IEEE Trans. Inf. Theory.
[5] Vladimir Vapnik, et al. Statistical learning theory, 1998.
[6] V. Koltchinskii, et al. Rademacher Processes and Bounding the Risk of Function Learning, 2004, math/0405338.
[7] Nikola S. Nikolov, et al. How to Layer a Directed Acyclic Graph, 2001, GD.
[8] Ralf Herbrich, et al. Algorithmic Luckiness, 2001, J. Mach. Learn. Res.
[9] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[10] V. Koltchinskii, et al. Empirical margin distributions and bounding the generalization error of combined classifiers, 2002, math/0405343.
[11] Peter L. Bartlett, et al. Model Selection and Error Estimation, 2000, Machine Learning.
[12] Sayan Mukherjee, et al. Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization, 2006, Adv. Comput. Math.
[13] Gavin C. Cawley, et al. On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation, 2010, J. Mach. Learn. Res.
[14] Shie Mannor, et al. Robustness and generalization, 2010, Machine Learning.
[15] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.
[16] Ameet Talwalkar, et al. Foundations of Machine Learning, 2012, Adaptive computation and machine learning.
[17] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[18] Razvan Pascanu, et al. On the number of response regions of deep feed forward networks with piece-wise linear activations, 2013, 1312.6098.
[19] Roi Livni, et al. On the Computational Efficiency of Training Neural Networks, 2014, NIPS.
[20] Benjamin Graham, et al. Fractional Max-Pooling, 2014, ArXiv.
[21] Razvan Pascanu, et al. On the Number of Linear Regions of Deep Neural Networks, 2014, NIPS.
[22] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[23] Ryota Tomioka, et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[24] Joel A. Tropp, et al. An Introduction to Matrix Concentration Inequalities, 2015, Found. Trends Mach. Learn.
[25] Kensuke Yokoi, et al. APAC: Augmented PAttern Classification with Neural Networks, 2015, ArXiv.
[26] Ruslan Salakhutdinov, et al. Path-SGD: Path-Normalized Optimization in Deep Neural Networks, 2015, NIPS.
[27] Pengtao Xie, et al. On the Generalization Error Bounds of Neural Networks under Diversity-Inducing Mutual Angular Regularization, 2015, ArXiv.
[28] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[29] Leslie Pack Kaelbling, et al. Bayesian Optimization with Exponential Convergence, 2015, NIPS.
[30] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[31] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[32] Ohad Shamir, et al. On the Quality of the Initial Basin in Overspecified Neural Networks, 2015, ICML.
[33] Matus Telgarsky, et al. Benefits of Depth in Neural Networks, 2016, COLT.
[34] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[35] Kenji Kawaguchi, et al. Bounded Optimal Exploration in MDP, 2016, AAAI.
[36] Tie-Yan Liu, et al. On the Depth of Deep Neural Networks: A Theoretical View, 2015, AAAI.
[37] Sergey Levine, et al. Unsupervised Learning for Physical Interaction through Video Prediction, 2016, NIPS.
[38] Tegan Maharaj, et al. Deep Nets Don't Learn via Memorization, 2017, ICLR.
[39] Zhanxing Zhu, et al. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes, 2017, ArXiv.
[40] Shai Shalev-Shwartz, et al. Fast Rates for Empirical Risk Minimization of Strict Saddle Problems, 2017, COLT.
[41] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[42] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[43] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[44] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[45] T. Poggio, et al. Theory of Deep Learning III: Generalization Properties of SGD, Memo No. 067, June 27, 2017.
[46] Elad Hoffer, et al. Train longer, generalize better: closing the generalization gap in large batch training of neural networks, 2017, NIPS.
[47] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[48] Guillermo Sapiro, et al. Generalization Error of Invariant Classifiers, 2016, AISTATS.
[49] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[50] Guillermo Sapiro, et al. Robust Large Margin Deep Neural Networks, 2016, IEEE Transactions on Signal Processing.
[51] Zhuowen Tu, et al. Aggregated Residual Transformations for Deep Neural Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Lorenzo Rosasco, et al. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review, 2016, International Journal of Automation and Computing.
[53] Yoshua Bengio, et al. Towards Understanding Generalization via Analytical Learning Theory, 2018, 1802.07426.
[54] Suvrit Sra, et al. Global optimality conditions for deep neural networks, 2017, ICLR.
[55] Christoph H. Lampert, et al. Data-Dependent Stability of Stochastic Gradient Descent, 2017, ICML.