Maximum-and-Concatenation Networks
Zhouchen Lin | Hao Kong | Jianlong Wu | Xingyu Xie | Guangcan Liu | Wayne Zhang