Deep learning: a statistical viewpoint