Approximation by Combinations of ReLU and Squared ReLU Ridge Functions With $\ell^1$ and $\ell^0$ Controls
[1] Jason M. Klusowski, et al. Risk Bounds for High-dimensional Ridge Function Combinations Including Neural Networks, 2016, 1607.01434.
[2] Martin J. Wainwright, et al. Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls, 2009, IEEE Transactions on Information Theory.
[3] Y. Makovoz. Uniform Approximation by Neural Networks, 1998.
[4] David Haussler, et al. Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension, 1995, J. Comb. Theory, Ser. A.
[5] Andrew R. Barron, et al. Approximation and estimation bounds for artificial neural networks, 2004, Machine Learning.
[6] Y. Makovoz. Random Approximants and Neural Networks, 1996.
[7] Andrew R. Barron, et al. Minimax lower bounds for ridge combinations including neural nets, 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[8] Marcello Sanguineti, et al. Estimates of covering numbers of convex sets with slowly decaying orthogonal subsets, 2007, Discret. Appl. Math.
[9] Martin J. Wainwright, et al. Learning Halfspaces and Neural Networks with Random Initialization, 2015, ArXiv.
[10] Jon A. Wellner, et al. Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.
[11] Andrew R. Barron, et al. A Better Approximation for Balls, 2000.
[12] Anima Anandkumar, et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods, 2017.
[13] Amir Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[14] Gábor Lugosi, et al. Concentration Inequalities, 2008, COLT.
[15] J. Neyman. On the Two Different Aspects of the Representative Method: the Method of Stratified Sampling and the Method of Purposive Selection, 1934.
[16] Stratis Ioannidis, et al. Learning Combinations of Sigmoids Through Gradient Estimation, 2017, ArXiv.
[17] Leo Breiman, et al. Hinging hyperplanes for regression, classification, and function approximation, 1993, IEEE Trans. Inf. Theory.
[18] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[19] Peter L. Bartlett, et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network, 1998, IEEE Trans. Inf. Theory.
[20] Halbert White, et al. Sup-norm approximation bounds for networks through probabilistic methods, 1995, IEEE Trans. Inf. Theory.
[21] Soumendu Sundar Mukherjee, et al. Weak convergence and empirical processes, 2019.