On Margin Maximization in Linear and ReLU Networks
[1] Gal Vardi. On the Implicit Bias in Deep-Learning Algorithms, 2022, Commun. ACM.
[2] O. Shamir, et al. Reconstructing Training Data from Trained Neural Networks, 2022, NeurIPS.
[3] Jason D. Lee, et al. On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias, 2022, NeurIPS.
[4] O. Shamir, et al. Gradient Methods Provably Converge to Non-Robust Networks, 2022, NeurIPS.
[5] O. Shamir, et al. Implicit Regularization Towards Rank Minimization in ReLU Networks, 2022, ALT.
[6] Sanjeev Arora, et al. Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias, 2021, NeurIPS.
[7] Ilya P. Razenshteyn, et al. Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm, 2021, COLT.
[8] Nathan Srebro, et al. On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent, 2021, ICML.
[9] Amir Globerson, et al. Towards Understanding Learning in Neural Networks with Linear Teachers, 2021, ICML.
[10] Kaifeng Lyu, et al. Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning, 2020, ICLR.
[11] O. Shamir, et al. Implicit Regularization in ReLU Networks with the Square Loss, 2020, COLT.
[12] H. Mobahi, et al. A Unifying View on Implicit Bias in Training Linear Neural Networks, 2020, ICLR.
[13] Armin Eftekhari, et al. Implicit Regularization in Matrix Sensing: A Geometric View Leads to Stronger Results, 2020, arXiv.
[14] Nathan Srebro, et al. Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy, 2020, NeurIPS.
[15] Ohad Shamir, et al. Gradient Methods Never Overfit On Separable Data, 2020, J. Mach. Learn. Res.
[16] Matus Telgarsky, et al. Gradient descent follows the regularization path for general losses, 2020, COLT.
[17] Matus Telgarsky, et al. Directional convergence and alignment in deep learning, 2020, NeurIPS.
[18] Nadav Cohen, et al. Implicit Regularization in Deep Learning May Not Be Explainable by Norms, 2020, NeurIPS.
[19] Mert Pilanci, et al. Convex Geometry and Duality of Over-parameterized Neural Networks, 2020, J. Mach. Learn. Res.
[20] Mert Pilanci, et al. Revealing the Structure of Deep Neural Networks via Convex Duality, 2020, ICML.
[21] Francis Bach, et al. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss, 2020, COLT.
[22] Mohamed Ali Belabbas, et al. On implicit regularization: Morse functions and applications to matrix factorization, 2020, arXiv.
[23] Nathan Srebro, et al. Kernel and Rich Regimes in Overparametrized Models, 2019, COLT.
[24] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[25] Matus Telgarsky, et al. Characterizing the implicit bias via a primal-dual analysis, 2019, ALT.
[26] Sanjeev Arora, et al. Implicit Regularization in Deep Matrix Factorization, 2019, NeurIPS.
[27] Nathan Srebro, et al. Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models, 2019, ICML.
[28] Francis Bach, et al. Implicit Regularization of Discrete Gradient Dynamics in Deep Linear Neural Networks, 2019, NeurIPS.
[29] Matus Telgarsky, et al. Gradient descent aligns the layers of deep linear networks, 2018, ICLR.
[30] Wei Hu, et al. Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced, 2018, NeurIPS.
[31] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[32] Matus Telgarsky, et al. Risk and parameter convergence of logistic regression, 2018, arXiv.
[33] Nathan Srebro, et al. Convergence of Gradient Descent on Separable Data, 2018, AISTATS.
[34] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[35] Hongyang Zhang, et al. Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations, 2017, COLT.
[36] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[37] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[38] Nathan Srebro, et al. Implicit Regularization in Matrix Factorization, 2017, 2018 Information Theory and Applications Workshop (ITA).
[39] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[40] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[41] Kalyanmoy Deb, et al. Approximate KKT points and a proximity measure for termination, 2013, J. Glob. Optim.
[42] Mary Phuong, et al. The inductive bias of ReLU networks on orthogonally separable data, 2021, ICLR.
[43] Yuxin Chen, et al. Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion, 2018, ICML.
[44] Ying Xiong. Nonlinear Optimization, 2014.
[45] Yu. S. Ledyaev, et al. Nonsmooth analysis and control theory, 1998.