Learning a Single Neuron with Gradient Methods
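The paper's subject is learning a single neuron with gradient methods. As an illustrative aid only, the snippet below sketches the canonical version of that problem: fitting a single ReLU neuron, y = max(0, ⟨w*, x⟩), by plain gradient descent on the empirical squared loss, a setting also considered in [16]. The Gaussian data model, noiseless labels, step size, and iteration count are assumptions made for this example, not the paper's exact algorithm or guarantees.

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's exact setting):
# fit a single ReLU neuron y = max(0, <w*, x>) with plain gradient descent
# on the empirical squared loss, using i.i.d. Gaussian inputs and noiseless labels.

rng = np.random.default_rng(0)
d, n = 20, 5000
w_star = rng.standard_normal(d)        # ground-truth weight vector (assumed)
X = rng.standard_normal((n, d))        # i.i.d. Gaussian inputs (assumed)
y = np.maximum(X @ w_star, 0.0)        # noiseless ReLU labels

w = 0.1 * rng.standard_normal(d)       # random initialization
lr = 0.05                              # step size chosen for illustration
for _ in range(500):
    pred = np.maximum(X @ w, 0.0)
    # Gradient of the empirical squared loss, taking ReLU'(z) = 1{z > 0}.
    grad = X.T @ ((pred - y) * (X @ w > 0)) / n
    w -= lr * grad

print("relative error:", np.linalg.norm(w - w_star) / np.linalg.norm(w_star))
```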
[1] Quanquan Gu, et al. Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks, 2019, AAAI.
[2] Yan Shuo Tan, et al. Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval, 2019, ArXiv.
[3] Gilad Yehudai, et al. On the Power and Limitations of Random Features for Understanding Neural Networks, 2019, NeurIPS.
[4] Yuan Cao, et al. A Generalization Theory of Gradient Descent for Learning Over-parameterized Deep ReLU Networks, 2019, ArXiv.
[5] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[6] Amir Salman Avestimehr, et al. Fitting ReLUs via SGD and Quantized SGD, 2019, IEEE International Symposium on Information Theory (ISIT).
[7] Samet Oymak, et al. Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?, 2018, ICML.
[8] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[9] Samet Oymak, et al. Stochastic Gradient Descent Learns State Equations with Nonlinear Activations, 2018, COLT.
[10] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[11] Karthik Sridharan, et al. Uniform Convergence of Gradients for Non-Convex Learning and Optimization, 2018, NeurIPS.
[12] Ohad Shamir, et al. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks, 2017, ICML.
[13] Yuandong Tian, et al. When is a Convolutional Filter Easy To Learn?, 2017, ICLR.
[14] Ohad Shamir, et al. Distribution-Specific Hardness of Learning Neural Networks, 2016, J. Mach. Learn. Res.
[15] A. Montanari, et al. The landscape of empirical risk for nonconvex losses, 2016, The Annals of Statistics.
[16] Mahdi Soltanolkotabi, et al. Learning ReLUs via Gradient Descent, 2017, NIPS.
[17] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[18] Yuandong Tian, et al. An Analytical Formula of Population Gradient for Two-layered ReLU Network and its Applications in Convergence and Critical Point Analysis, 2017, ICML.
[19] Anima Anandkumar, et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods, 2017.
[20] Amit Daniely, et al. SGD Learns the Conjugate Kernel Class of the Network, 2017, NIPS.
[21] Amir Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[22] Varun Kanade, et al. Reliably Learning the ReLU in Polynomial Time, 2016, COLT.
[23] John Wright, et al. A Geometric Analysis of Phase Retrieval, 2016, IEEE International Symposium on Information Theory (ISIT).
[24] John Wright, et al. When Are Nonconvex Problems Not Scary?, 2015, ArXiv.
[25] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[26] Ohad Shamir, et al. A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate, 2014, ICML.
[27] Ohad Shamir. A Variant of Azuma's Inequality for Martingales with Subgaussian Tails, 2011, ArXiv.
[28] Adam Tauman Kalai, et al. Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression, 2011, NIPS.
[29] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[30] Adam Tauman Kalai, et al. The Isotron Algorithm: High-Dimensional Isotonic Regression, 2009, COLT.
[31] W. Hoeffding. Probability inequalities for sums of bounded random variables, 1963.