Revealing the Structure of Deep Neural Networks via Convex Duality
[1] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[2] Francis Bach, et al. On Lazy Training in Differentiable Programming, 2018, NeurIPS.
[3] Nathan Srebro, et al. Implicit Regularization in Matrix Factorization, 2017, 2018 Information Theory and Applications Workshop (ITA).
[4] Sanjeev Arora, et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization, 2018, ICML.
[5] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[6] Makoto Yamada, et al. FsNet: Feature Selection Network on High-dimensional Biological Data, 2020, ArXiv.
[7] Wei Hu, et al. Width Provably Matters in Optimization for Deep Linear Neural Networks, 2019, ICML.
[8] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.
[9] Wei Hu, et al. Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced, 2018, NeurIPS.
[10] Nathan Srebro, et al. How do infinite width bounded norm networks look in function space?, 2019, COLT.
[11] Jin Keun Seo, et al. Framelet pooling aided deep learning network: the method to process high dimensional medical data, 2019, Mach. Learn. Sci. Technol.
[12] Mert Pilanci, et al. Convex Neural Autoregressive Models: Towards Tractable, Expressive, and Theoretically-Backed Models for Sequential Forecasting and Generation, 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Mert Pilanci, et al. Convex Optimization for Shallow Neural Networks, 2019, 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[14] Mert Pilanci, et al. Convex Duality and Cutting Plane Methods for Over-parameterized Neural Networks, 2019.
[15] Wei Hu, et al. A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks, 2018, ICLR.
[16] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[17] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[18] Ohad Shamir, et al. Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks, 2018, COLT.
[19] Mert Pilanci, et al. Convex Geometry of Two-Layer ReLU Networks: Implicit Autoencoding and Interpretable Models, 2020, AISTATS.
[20] Qiang Liu, et al. On the Margin Theory of Feedforward Neural Networks, 2018, ArXiv.
[21] Robert D. Nowak, et al. Minimum "Norm" Neural Networks are Splines, 2019, ArXiv.
[22] Ji Zhu, et al. ℓ1 Regularization in Infinite Dimensional Feature Spaces, 2007, COLT.
[23] Ruslan Salakhutdinov, et al. Deep Neural Networks with Multi-Branch Architectures Are Intrinsically Less Non-Convex, 2019, AISTATS.
[24] Lei Huang, et al. Decorrelated Batch Normalization, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Thomas Laurent, et al. Deep Linear Networks with Arbitrary Loss: All Local Minima Are Global, 2017, ICML.
[26] Colin Wei, et al. Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel, 2018, NeurIPS.
[27] Mert Pilanci, et al. Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms, 2020, ICLR.
[28] Francis Bach, et al. A Note on Lazy Training in Supervised Differentiable Programming, 2018, ArXiv.
[29] Morteza Mardani, et al. Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization, 2021, ICLR.
[30] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[31] Nathan Srebro, et al. Global Optimality of Local Search for Low Rank Matrix Recovery, 2016, NIPS.
[32] Sanjeev Arora, et al. Implicit Regularization in Deep Matrix Factorization, 2019, NeurIPS.
[33] Nathan Srebro, et al. Implicit Bias of Gradient Descent on Linear Convolutional Networks, 2018, NeurIPS.
[34] Sylvain Gelly, et al. Gradient Descent Quantizes ReLU Network Features, 2018, ArXiv.
[35] Mert Pilanci, et al. Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time, 2021, ICLR.
[36] Mert Pilanci, et al. Convex Geometry and Duality of Over-parameterized Neural Networks, 2020, J. Mach. Learn. Res.
[37] David L. Donoho, et al. Prevalence of neural collapse during the terminal phase of deep learning training, 2020, Proceedings of the National Academy of Sciences.
[38] Matus Telgarsky, et al. Gradient descent aligns the layers of deep linear networks, 2018, ICLR.
[39] Shai Shalev-Shwartz, et al. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data, 2017, ICLR.
[40] Yu-Chiang Frank Wang, et al. A Closer Look at Few-shot Classification, 2019, ICLR.
[41] Mert Pilanci, et al. Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-Layer Networks, 2020, ICML.
[42] Francis R. Bach, et al. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.