Characterizing the implicit bias via a primal-dual analysis
[1] Matus Telgarsky, et al. Risk and parameter convergence of logistic regression, 2018, arXiv.
[2] Manfred K. Warmuth, et al. Boosting as entropy projection, 1999, COLT '99.
[3] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[4] Yoav Freund, et al. Boosting: Foundations and Algorithms, 2012.
[5] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, J. Comput. Syst. Sci.
[6] Cynthia Rudin, et al. The Dynamics of AdaBoost: Cyclic Behavior and Convergence of Margins, 2004, J. Mach. Learn. Res.
[7] Nathan Srebro, et al. Convergence of Gradient Descent on Separable Data, 2018, AISTATS.
[8] Adrian S. Lewis, et al. Convex Analysis and Nonlinear Optimization, 2000.
[9] Shai Shalev-Shwartz. Online Learning: Theory, Algorithms and Applications (PhD thesis), 2007.
[10] J. Hiriart-Urruty, et al. Fundamentals of Convex Analysis, 2004.
[11] Nathan Srebro, et al. Characterizing Implicit Bias in Terms of Optimization Geometry, 2018, ICML.
[12] Yoram Singer, et al. Logistic Regression, AdaBoost and Bregman Distances, 2000, Machine Learning.
[13] Shai Shalev-Shwartz, et al. Online Learning and Online Convex Optimization, 2012, Found. Trends Mach. Learn.
[14] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[15] Matus Telgarsky, et al. Gradient descent follows the regularization path for general losses, 2020, COLT.
[16] Paul Grigas, et al. AdaBoost and Forward Stagewise Regression are First-Order Convex Optimization Methods, 2013, arXiv.
[17] Yoram Singer, et al. On the equivalence of weak learnability and linear separability: new relaxations and efficient boosting algorithms, 2010, Machine Learning.
[18] Nathan Srebro, et al. The Implicit Bias of Gradient Descent on Separable Data, 2017, J. Mach. Learn. Res.
[19] Matus Telgarsky, et al. Gradient descent aligns the layers of deep linear networks, 2018, ICLR.
[20] Kaifeng Lyu, et al. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks, 2019, ICLR.
[21] David P. Woodruff, et al. Sublinear Optimization for Machine Learning, 2010, IEEE 51st Annual Symposium on Foundations of Computer Science.
[22] Francis Bach, et al. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss, 2020, COLT.
[23] Nathan Srebro, et al. Kernel and Rich Regimes in Overparametrized Models, 2019, COLT.
[24] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[25] Matus Telgarsky, et al. Spectrally-normalized margin bounds for neural networks, 2017, NIPS.
[26] Matus Telgarsky, et al. Directional convergence and alignment in deep learning, 2020, NeurIPS.
[27] Yoav Freund, et al. Boosting the margin: A new explanation for the effectiveness of voting methods, 1997, ICML.
[28] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.
[29] Matus Telgarsky, et al. Margins, Shrinkage, and Boosting, 2013, ICML.