Amin Karbasi | Kartik K. Sreenivasan | Dimitris Papailiopoulos | Shashank Rajput
[1] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[2] Yih-Fang Huang, et al. Bounds on the number of hidden neurons in multilayer perceptrons, 1991, IEEE Trans. Neural Networks.
[3] Panos J. Antsaklis, et al. A simple method to derive bounds on the size and to train multilayer neural networks, 1991, IEEE Trans. Neural Networks.
[4] Guang-Bin Huang, et al. Learning capability and storage capacity of two-hidden-layer feedforward networks, 2003, IEEE Trans. Neural Networks.
[5] Roman Vershynin, et al. Memory Capacity of Neural Networks with Threshold and Rectified Linear Unit Activations, 2020, SIAM J. Math. Data Sci.
[6] Jinwoo Shin, et al. Provable Memorization via Deep Neural Networks using Sub-linear Parameters, 2020, COLT.
[7] Yann LeCun, et al. Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks, 2018, ArXiv.
[8] Adam Kowalczyk, et al. Estimates of Storage Capacity of Multilayer Perceptron with Threshold Logic Hidden Units, 1997, Neural Networks.
[9] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[10] Suvrit Sra, et al. Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity, 2018, NeurIPS.
[11] Eduardo D. Sontag, et al. Remarks on Interpolation and Recognition Using Neural Nets, 1990, NIPS.
[12] A. Batyuk, et al. Bithreshold Neural Network Classifier, 2020 IEEE 15th International Conference on Computer Sciences and Information Technologies (CSIT).
[13] Mikhail Belkin, et al. Two models of double descent for weak features, 2019, SIAM J. Math. Data Sci.
[14] R. Durbin, et al. Bounds on the learning capacity of some multi-layer networks, 1989, Biological Cybernetics.
[15] Ronen Eldan, et al. Network size and weights size for memorization with two-layers neural networks, 2020, ArXiv.
[16] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science, 2018, Cambridge University Press.
[17] K. Ball. An Elementary Introduction to Modern Convex Geometry, 1997.
[18] Peter L. Bartlett, et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network, 1998, IEEE Trans. Inf. Theory.
[19] Thomas M. Cover, et al. Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition, 1965, IEEE Trans. Electron. Comput.
[20] Eric B. Baum, et al. On the capabilities of multilayer perceptrons, 1988, J. Complex.
[21] Dimitris Achlioptas, et al. Bad Global Minima Exist and SGD Can Reach Them, 2019, NeurIPS.
[22] L. Gordon, et al. Tutorial on large deviations for the binomial distribution, 1989, Bulletin of Mathematical Biology.
[23] Tengyu Ma, et al. Identity Matters in Deep Learning, 2016, ICLR.