[1] E. Hlawka. Funktionen von beschränkter Variation in der Theorie der Gleichverteilung, 1961.
[2] H. Niederreiter. Quasi-Monte Carlo methods and pseudo-random numbers, 1978.
[3] R. Dudley. A course on empirical processes, 1984.
[4] John Shawe-Taylor, et al. Structural Risk Minimization Over Data-Dependent Hierarchies, 1998, IEEE Trans. Inf. Theory.
[5] Vladimir Vapnik, et al. Statistical learning theory, 1998.
[6] R. Ash, et al. Probability and measure theory, 1999.
[7] V. Koltchinskii, et al. Rademacher Processes and Bounding the Risk of Function Learning, 2004, arXiv:math/0405338.
[8] Ralf Herbrich, et al. Algorithmic Luckiness, 2001, J. Mach. Learn. Res.
[9] E. Novak, et al. The inverse of the star-discrepancy depends linearly on the dimension, 2001.
[10] André Elisseeff, et al. Stability and Generalization, 2002, J. Mach. Learn. Res.
[11] Peter L. Bartlett, et al. Model Selection and Error Estimation, 2000, Machine Learning.
[12] Sayan Mukherjee, et al. Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization, 2006, Adv. Comput. Math.
[13] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[14] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[15] Shie Mannor, et al. Robustness and generalization, 2010, Machine Learning.
[16] Christoph Aistleitner, et al. Covering numbers, dyadic chaining and discrepancy, 2011, J. Complex.
[17] Ameet Talwalkar, et al. Foundations of Machine Learning, 2012, Adaptive Computation and Machine Learning.
[18] C. Aistleitner, et al. Low-discrepancy point sets for non-uniform measures, 2013, arXiv:1308.5049.
[19] Yoshua Bengio, et al. Better Mixing via Deep Representations, 2012, ICML.
[20] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[21] C. Aistleitner, et al. Functions of bounded variation, signed measures, and a general Koksma-Hlawka inequality, 2014, arXiv:1406.0230.
[22] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[23] Shin Ishii, et al. Distributional Smoothing with Virtual Adversarial Training, 2015, ICLR 2016.
[24] Leslie Pack Kaelbling, et al. Bayesian Optimization with Exponential Convergence, 2015, NIPS.
[25] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[26] R. Tichy, et al. On functions of bounded variation, 2015, Mathematical Proceedings of the Cambridge Philosophical Society.
[27] Yu Maruyama, et al. Global Continuous Optimization with Error Bound and Fast Convergence, 2016, J. Artif. Intell. Res.
[28] Tolga Tasdizen, et al. Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning, 2016, NIPS.
[29] Kenji Kawaguchi, et al. Bounded Optimal Exploration in MDP, 2016, AAAI.
[30] Tegan Maharaj, et al. Deep Nets Don't Learn via Memorization, 2017, ICLR.
[31] Lei Wu, et al. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes, 2017, arXiv.
[32] Leslie Pack Kaelbling, et al. Generalization in Deep Learning, 2017, arXiv.
[33] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[34] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[35] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[36] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[37] Elad Hoffer, et al. Train longer, generalize better: closing the generalization gap in large batch training of neural networks, 2017, NIPS.
[38] Stefano Soatto, et al. Emergence of invariance and disentangling in deep representations, 2017.
[39] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[40] Yang Yang, et al. Deep Learning Scaling is Predictable, Empirically, 2017, arXiv.
[41] Timo Aila, et al. Temporal Ensembling for Semi-Supervised Learning, 2016, ICLR.
[42] Graham W. Taylor, et al. Improved Regularization of Convolutional Neural Networks with Cutout, 2017, arXiv.
[43] Lorenzo Rosasco, et al. Theory of Deep Learning III: explaining the non-overfitting puzzle, 2017, arXiv.
[44] Stefano Soatto, et al. Emergence of Invariance and Disentanglement in Deep Representations, 2017, 2018 Information Theory and Applications Workshop (ITA).
[45] Shai Shalev-Shwartz, et al. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data, 2017, ICLR.
[46] Elad Hoffer, et al. Exponentially vanishing sub-optimal local minima in multilayer neural networks, 2017, ICLR.
[47] Shie Mannor, et al. Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms, 2016, ICLR.