[1] Allan Pinkus. Approximation theory of the MLP model in neural networks, 1999, Acta Numerica.
[2] Johannes Schmidt-Hieber, et al. A regularity class for the roots of nonnegative functions, 2017.
[3] Andrew R. Barron. Approximation and estimation bounds for artificial neural networks, 1994, Machine Learning.
[4] Kaiming He, Jian Sun. Convolutional neural networks at constrained time cost, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Suriya Gunasekar, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[6] Anna Choromanska, et al. The Loss Surfaces of Multilayer Networks, 2014, arXiv.
[7] Jonathan Frankle, Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[8] Shiyu Liang, R. Srikant. Why Deep Neural Networks for Function Approximation?, 2016, ICLR.
[9] László Györfi, Michael Kohler, Adam Krzyżak, Harro Walk. A Distribution-Free Theory of Nonparametric Regression, 2002, Springer Series in Statistics.
[10] Jeremy Kepner, et al. Neural Network Topologies for Sparse Training, 2018, IEEE MIT Undergraduate Research Technology Conference (URTC).
[11] Kaiming He, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Thomas Strohmer, Roman Vershynin. A Randomized Kaczmarz Algorithm with Exponential Convergence, 2007, arXiv:math/0702226.
[13] Ryumei Nakada, Masaaki Imaizumi. Adaptive Approximation and Estimation of Deep Neural Network to Intrinsic Dimensionality, 2019, arXiv.
[14] Razvan Pascanu, et al. On the number of response regions of deep feed forward networks with piece-wise linear activations, 2013, arXiv:1312.6098.
[15] Taiji Suzuki, et al. Fast learning rate of deep learning via a kernel perspective, 2017, arXiv.
[16] Mikhail Belkin, et al. Does data interpolation contradict statistical optimality?, 2018, AISTATS.
[17] Kurt Hornik, et al. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, 1990, Neural Networks.
[18] Tomaso A. Poggio, et al. Theory II: Landscape of the Empirical Risk in Deep Learning, 2017, arXiv.
[19] Yannick Baraud, Lucien Birgé. Estimating composite functions by model selection, 2011, arXiv:1102.2818.
[20] Joel L. Horowitz, Enno Mammen. Rate-optimal estimation for a general class of nonparametric regression models with unknown link functions, 2007, arXiv:0803.2999.
[21] Decebal Constantin Mocanu, et al. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science, 2017, Nature Communications.
[22] Albert Cohen, Ingrid Daubechies, Pierre Vial. Wavelets on the Interval and Fast Wavelet Transforms, 1993.
[23] Christian Szegedy, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.
[24] Daniel Soudry, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, Journal of Machine Learning Research.
[25] Peter J. Green, Bernard W. Silverman. Nonparametric regression and generalized linear models, 1994.
[26] Xavier Glorot, Antoine Bordes, Yoshua Bengio. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[27] Pedro Savarese, et al. How do infinite width bounded norm networks look in function space?, 2019, COLT.
[28] Guido Montúfar, et al. On the Number of Linear Regions of Deep Neural Networks, 2014, NIPS.
[29] Maxwell B. Stinchcombe. Neural network approximation of continuous functionals and continuous functions on compactifications, 1999, Neural Networks.
[30] George Cybenko. Approximation by superpositions of a sigmoidal function, 1989, Mathematics of Control, Signals and Systems.
[31] Dan Hendrycks, Thomas G. Dietterich. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations, 2018, ICLR.
[32] Andrew R. Barron. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Transactions on Information Theory.
[33] Francis R. Bach. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, Journal of Machine Learning Research.
[34] Dabal Pedamonti. Comparison of non-linear activation functions for deep neural networks on MNIST classification task, 2018, arXiv.
[35] Hrushikesh N. Mhaskar, Tomaso Poggio. Deep vs. shallow networks: An approximation theory perspective, 2016, arXiv.
[36] Matus Telgarsky. Benefits of Depth in Neural Networks, 2016, COLT.
[37] Jason M. Klusowski, Andrew R. Barron. Uniform Approximation by Neural Networks Activated by First and Second Order Ridge Splines, 2016.
[38] Guido Montúfar. Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units, 2013, Neural Computation.
[39] Andrew R. Barron, Jason M. Klusowski. Approximation and Estimation for High-Dimensional Deep Learning Networks, 2018, arXiv.
[40] Franco Scarselli, Ah Chung Tsoi. Universal Approximation Using Feedforward Neural Networks: A Survey of Some Existing Methods, and Some New Results, 1998, Neural Networks.
[41] Alexandre B. Tsybakov. Introduction to Nonparametric Estimation, 2008, Springer Series in Statistics.
[42] Robert M. Gower, Peter Richtárik. Stochastic Dual Ascent for Solving Linear Systems, 2015, arXiv.
[43] Larry Wasserman. All of Nonparametric Statistics, 2006, Springer Texts in Statistics.
[44] Hrushikesh Narhar Mhaskar. Approximation properties of a multilayered feedforward artificial neural network, 1993, Advances in Computational Mathematics.
[45] Trevor Gale, Erich Elsen, Sara Hooker. The State of Sparsity in Deep Neural Networks, 2019, arXiv.
[46] V. Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization, 2006, arXiv:0708.0083.
[47] Tomaso Poggio, et al. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review, 2016, International Journal of Automation and Computing.
[48] Taiji Suzuki. Fast generalization error bound of deep learning from a kernel perspective, 2018, AISTATS.
[49] E. Candès. New Ties between Computational Harmonic Analysis and Approximation Theory, 2002.
[50] Anatoli B. Juditsky, Oleg V. Lepski, Alexandre B. Tsybakov. Nonparametric estimation of composite functions, 2009, arXiv:0906.0865.
[51] Dmitry Yarotsky. Optimal approximation of continuous functions by very deep ReLU networks, 2018, COLT.
[52] Subutai Ahmad, Luiz Scheinkman. How Can We Be So Dense? The Benefits of Using Highly Sparse Representations, 2019, arXiv.
[53] Adam Gaier, David Ha. Weight Agnostic Neural Networks, 2019, NeurIPS.
[54] M. Hamers, M. Kohler. Nonasymptotic Bounds on the L2 Error of Neural Network Regression Estimates, 2006.
[55] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[56] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks, 2012, Communications of the ACM.
[57] Konstantin Eckle, Johannes Schmidt-Hieber. A comparison of deep networks with ReLU activation function and linear spline-type methods, 2018, Neural Networks.
[58] Michael Kohler, et al. Analysis of the rate of convergence of least squares neural network regression estimates in case of measurement errors, 2011, Neural Networks.
[59] Benedikt Bauer, Michael Kohler. On deep learning as a remedy for the curse of dimensionality in nonparametric regression, 2019, The Annals of Statistics.
[60] Michael Kohler, Adam Krzyżak. Adaptive regression estimation with multilayer feedforward neural networks, 2004, Proceedings of the International Symposium on Information Theory (ISIT 2004).
[61] Martin Anthony, Peter L. Bartlett. Neural Network Learning: Theoretical Foundations, 1999.
[62] Aad W. van der Vaart, Jon A. Wellner. Weak Convergence and Empirical Processes, 1996, Springer Series in Statistics.
[63] Jason M. Klusowski, Andrew R. Barron. Risk Bounds for High-dimensional Ridge Function Combinations Including Neural Networks, 2016, arXiv:1607.01434.
[64] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, Journal of Machine Learning Research.
[65] Babak Hassibi, David G. Stork. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.
[66] Ameya Prabhu, et al. Deep Expander Networks: Efficient Deep Networks from Graph Theory, 2017, ECCV.
[67] Helmut Bölcskei, et al. Optimal Approximation with Sparsely Connected Deep Neural Networks, 2017, SIAM Journal on Mathematics of Data Science.
[68] Dmitry Yarotsky. Error bounds for approximations with deep ReLU networks, 2016, Neural Networks.
[69] Robert M. Gower, Peter Richtárik. Randomized Iterative Methods for Linear Systems, 2015, SIAM Journal on Matrix Analysis and Applications.
[70] Evarist Giné, Vladimir Koltchinskii. Concentration inequalities and asymptotic results for ratio type empirical processes, 2006, arXiv:math/0606788.
[71] Pascal Massart. Concentration inequalities and model selection, 2007.
[72] Johannes Schmidt-Hieber. Deep ReLU network approximation of functions on a manifold, 2019, arXiv.
[73] Daniel F. McCaffrey, A. Ronald Gallant. Convergence rates for single hidden layer feedforward networks, 1994, Neural Networks.
[74] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[75] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[76] Moshe Leshno, et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function, 1993, Neural Networks.
[77] Gérard Kerkyacharian, Dominique Picard. Minimax or maxisets?, 2002, Bernoulli.
[78] Michael Kohler, Adam Krzyżak. Nonparametric Regression Based on Hierarchical Interaction Models, 2017, IEEE Transactions on Information Theory.