暂无分享,去创建一个
[1] Sylvain Gelly,et al. Gradient Descent Quantizes ReLU Network Features , 2018, ArXiv.
[2] Julien Mairal,et al. On the Inductive Bias of Neural Tangent Kernels , 2019, NeurIPS.
[3] W. Rudin. Principles of mathematical analysis , 1964 .
[4] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[5] Andrew Y. Ng,et al. Learning Feature Representations with K-Means , 2012, Neural Networks: Tricks of the Trade.
[6] Francis Bach,et al. A Note on Lazy Training in Supervised Differentiable Programming , 2018, ArXiv.
[7] Glenn Fung,et al. Equivalence of Minimal ℓ0- and ℓp-Norm Solutions of Linear Equalities, Inequalities and Linear Programs for Sufficiently Small p , 2011, J. Optim. Theory Appl..
[8] Nathan Srebro,et al. How do infinite width bounded norm networks look in function space? , 2019, COLT.
[9] Prasad Raghavendra,et al. Hardness of Learning Halfspaces with Noise , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[10] Robert D. Nowak,et al. Minimum "Norm" Neural Networks are Splines , 2019, ArXiv.
[11] Joan Bruna,et al. Gradient Dynamics of Shallow Univariate ReLU Networks , 2019, NeurIPS.
[12] M. Talagrand,et al. Probability in Banach Spaces: Isoperimetry and Processes , 1991 .
[13] Guy Blanc,et al. Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process , 2019, COLT.
[14] Jaehoon Lee,et al. Deep Neural Networks as Gaussian Processes , 2017, ICLR.
[15] Y. Gordon. On Milman's inequality and random subspaces which escape through a mesh in ℝ n , 1988 .
[16] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[17] 丸山 徹. Convex Analysisの二,三の進展について , 1977 .
[18] Panagiotis Patrinos,et al. SuperMann: A Superlinearly Convergent Algorithm for Finding Fixed Points of Nonexpansive Operators , 2016, IEEE Transactions on Automatic Control.
[19] Radford M. Neal. Priors for Infinite Networks , 1996 .
[20] D. Donoho. For most large underdetermined systems of linear equations the minimal 𝓁1‐norm solution is also the sparsest solution , 2006 .
[21] Emmanuel J. Candès,et al. Decoding by linear programming , 2005, IEEE Transactions on Information Theory.
[22] Ruosong Wang,et al. On Exact Computation with an Infinitely Wide Neural Net , 2019, NeurIPS.
[23] M. A. López-Cerdá,et al. Linear Semi-Infinite Optimization , 1998 .
[24] Arthur Jacot,et al. Neural tangent kernel: convergence and generalization in neural networks (invited paper) , 2018, NeurIPS.
[25] Yann LeCun,et al. The mnist database of handwritten digits , 2005 .
[26] Ryota Tomioka,et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning , 2014, ICLR.
[27] Qiang Liu,et al. On the Margin Theory of Feedforward Neural Networks , 2018, ArXiv.
[28] Nicolas Le Roux,et al. Convex Neural Networks , 2005, NIPS.
[29] Philip Wolfe,et al. An algorithm for quadratic programming , 1956 .
[30] Peng Zhao,et al. On Model Selection Consistency of Lasso , 2006, J. Mach. Learn. Res..
[31] Colin Wei,et al. Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel , 2018, NeurIPS.
[32] Francis Bach,et al. On Lazy Training in Differentiable Programming , 2018, NeurIPS.
[33] Richard E. Turner,et al. Gaussian Process Behaviour in Wide Deep Neural Networks , 2018, ICLR.
[34] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.
[35] Francis R. Bach,et al. Breaking the Curse of Dimensionality with Convex Neural Networks , 2014, J. Mach. Learn. Res..
[36] Oskar Maria Baksalary,et al. Particular formulae for the Moore-Penrose inverse of a columnwise partitioned matrix , 2007 .