Wenjia Wang | Guang Cheng | Cong Lin | Tianyang Hu
[1] V. G. Troitsky, et al. Journal of Mathematical Analysis and Applications, 1960.
[2] G. Wahba, et al. Some results on Tchebycheffian spline functions, 1971.
[3] W. W. Daniel. Applied Nonparametric Statistics, 1979.
[4] C. J. Stone, et al. Optimal Global Rates of Convergence for Nonparametric Regression, 1982.
[5] S. Shott, et al. Nonparametric Statistics, 2018, The Encyclopedia of Archaeological Sciences.
[6] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[7] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[8] Rich Caruana, et al. Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping, 2000, NIPS.
[9] S. Geer. Empirical Processes in M-Estimation, 2000.
[10] R. Varga. Geršgorin and His Circles, 2004.
[11] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[12] Martin J. Wainwright, et al. Early stopping for non-parametric regression: An optimal data-dependent stopping rule, 2011, 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[13] Lutz Prechelt, et al. Early Stopping - But When?, 2012, Neural Networks: Tricks of the Trade.
[14] K. Atkinson, et al. Spherical Harmonics and Approximations on the Unit Sphere: An Introduction, 2012.
[15] S. Geer. On the uniform convergence of empirical norms and inner products, with application to causal inference, 2013, arXiv:1310.5523.
[16] J. Dick, et al. A Characterization of Sobolev Spaces on the Sphere and an Extension of Stolarsky's Invariance Principle to Arbitrary Smoothness, 2012, arXiv:1203.5157.
[17] C. Frye, et al. Spherical Harmonics in p Dimensions, 2012, arXiv:1205.3548.
[18] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[19] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[20] Kawin Setsompop, et al. Fast image reconstruction with L2-regularization, 2013, Journal of Magnetic Resonance Imaging (JMRI).
[21] Jing Wang, et al. Entropy numbers of Besov classes of generalized smoothness on the sphere, 2014.
[22] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[23] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[24] Ming Yuan, et al. Minimax Optimal Rates of Estimation in High Dimensional Additive Models: Universal Phase Transition, 2015, ArXiv.
[25] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ekachai Phaisangittisagul, et al. An Analysis of the Regularization Between L2 and Dropout in Single Hidden Layer Neural Network, 2016, 7th International Conference on Intelligent Systems, Modelling and Simulation (ISMS).
[27] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[28] Léon Bottou, et al. Wasserstein GAN, 2017, ArXiv.
[29] Johannes Schmidt-Hieber, et al. Nonparametric regression using deep neural networks with ReLU activation function, 2017, The Annals of Statistics.
[30] Dmitry Yarotsky, et al. Error bounds for approximations with deep ReLU networks, 2016, Neural Networks.
[31] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[32] Twan van Laarhoven, et al. L2 Regularization versus Batch and Weight Normalization, 2017, ArXiv.
[33] Francis R. Bach, et al. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.
[34] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[35] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[36] A. Krzyżak, et al. Over-parametrized deep neural networks do not generalize well, 2019, arXiv:1912.03925.
[37] Quanquan Gu, et al. An Improved Analysis of Training Over-parameterized Deep Neural Networks, 2019, NeurIPS.
[38] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[39] Tri Dao, et al. A Kernel Theory of Modern Data Augmentation, 2018, ICML.
[40] Julien Mairal, et al. On the Inductive Bias of Neural Tangent Kernels, 2019, NeurIPS.
[41] M. Kohler, et al. On deep learning as a remedy for the curse of dimensionality in nonparametric regression, 2019, The Annals of Statistics.
[42] Kenji Fukumizu, et al. Deep Neural Networks Learn Non-Smooth Functions Effectively, 2018, AISTATS.
[43] Matus Telgarsky, et al. The implicit bias of gradient descent on nonseparable data, 2019, COLT.
[44] Colin Wei, et al. Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel, 2018, NeurIPS.
[45] Taiji Suzuki, et al. Refined Generalization Analysis of Gradient Descent for Over-parameterized Two-layer Neural Networks with Smooth Activations on Classification Problems, 2019, ArXiv.
[46] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[47] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[48] Ruiqi Liu, et al. Optimal Nonparametric Inference via Deep Neural Network, 2019, ArXiv.
[49] Matus Telgarsky, et al. Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks, 2019, ICLR.
[50] Zhiyuan Li, et al. Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee, 2019, ICLR.
[51] Quanquan Gu, et al. Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks, 2019, AAAI.
[52] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[53] R. Basri, et al. On the Similarity between the Laplace and Neural Tangent Kernels, 2020, NeurIPS.
[54] Shun-ichi Amari. Understand It in 5 Minutes!? Skimming Famous Papers: Jacot, Arthur, Gabriel, Franck and Hongler, Clement: Neural Tangent Kernel: Convergence and Generalization in Neural Networks (in Japanese), 2020.
[55] A. Lewkowycz, et al. On the training dynamics of deep networks with L2 regularization, 2020.
[56] Lin Chen, et al. Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS, 2020, ICLR.
[57] Taiji Suzuki, et al. Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime, 2020, ICLR.
[58] Yuan Cao, et al. Towards Understanding the Spectral Bias of Deep Learning, 2019, IJCAI.