On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias