Shay B. Cohen | Chunchuan Lyu | Zhunxuan Wang | Linyun He
[1] Zhize Li, et al. Learning Two-layer Neural Networks with Symmetric Inputs, 2018, ICLR.
[2] A. Wald, et al. On Stochastic Limit and Order Relationships, 1943.
[3] Joan Bruna, et al. Intriguing properties of neural networks, 2013, ICLR.
[4] Arun K. Kuchibhotla, et al. Efficient Estimation in Convex Single Index Models, 2017.
[5] Yin Zhang, et al. On the Superlinear and Quadratic Convergence of Primal-Dual Interior Point Linear Programming Algorithms, 1992, SIAM J. Optim.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Mahdi Soltanolkotabi, et al. Learning ReLUs via Gradient Descent, 2017, NIPS.
[8] Tengyu Ma, et al. Learning One-hidden-layer Neural Networks with Landscape Design, 2017, ICLR.
[9] Yoshua Bengio, et al. On the Expressive Power of Deep Architectures, 2011, ALT.
[10] Roi Livni, et al. On the Computational Efficiency of Training Neural Networks, 2014, NIPS.
[11] Yoon Kim, et al. Convolutional Neural Networks for Sentence Classification, 2014, EMNLP.
[12] Mary C. Meyer. A Simple New Algorithm for Quadratic Programming with Applications in Statistics, 2013, Commun. Stat. Simul. Comput.
[13] J. Zico Kolter, et al. Provable defenses against adversarial examples via the convex outer adversarial polytope, 2017, ICML.
[14] D. Luenberger. Optimization by Vector Space Methods, 1968.
[15] Florian Jarre, et al. On the convergence of the method of analytic centers when applied to convex quadratic programs, 1991, Math. Program.
[16] Leslie G. Valiant, et al. A theory of the learnable, 1984, STOC '84.
[17] I. Johnstone. High Dimensional Statistical Inference and Random Matrices, 2006, math/0611589.
[18] Adityanand Guntuboyina, et al. Nonparametric Shape-Restricted Regression, 2017, Statistical Science.
[19] Simon S. Du, et al. Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps, 2018, ArXiv.
[20] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[21] Stephen P. Boyd, et al. Graph Implementations for Nonsmooth Convex Programs, 2008, Recent Advances in Learning and Control.
[22] Adam Tauman Kalai, et al. Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression, 2011, NIPS.
[23] Narendra Karmarkar, et al. A new polynomial-time algorithm for linear programming, 1984, Comb.
[24] Alexandros G. Dimakis, et al. Learning Distributions Generated by One-Layer ReLU Networks, 2019, NeurIPS.
[25] Yinyu Ye, et al. An extension of Karmarkar's projective algorithm for convex quadratic programming, 1989, Math. Program.
[26] Yuanzhi Li, et al. What Can ResNet Learn Efficiently, Going Beyond Kernels?, 2019, NeurIPS.
[27] Amir Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[28] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[29] Katta G. Murty, et al. Computational complexity of parametric linear programming, 1980, Math. Program.
[30] E. Seijo, et al. Nonparametric Least Squares Estimation of a Multivariate Convex Regression Function, 2010, 1003.4765.
[31] Anima Anandkumar, et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods, 2017.
[32] L. Khachiyan, et al. The polynomial solvability of convex quadratic programming, 1980.
[33] Constance Van Eeden. Maximum Likelihood Estimation of Ordered Probabilities, 1956.
[34] R. Jennrich. Asymptotic Properties of Non-Linear Least Squares Estimators, 1969.
[35] H. D. Brunk, et al. An Empirical Distribution Function for Sampling with Incomplete Information, 1955.
[36] Alexander Shapiro, et al. Lectures on Stochastic Programming: Modeling and Theory, 2009.
[37] Kim-Chuan Toh, et al. SDPT3 -- A Matlab Software Package for Semidefinite Programming, 1996.
[38] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[39] H. Robbins, et al. Strong consistency of least squares estimates in multiple regression, 1978.
[40] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[41] Adam R. Klivans, et al. Learning Neural Networks with Two Nonlinear Layers in Polynomial Time, 2017, COLT.
[42] M. Borel. Les probabilités dénombrables et leurs applications arithmétiques, 1909.
[43] Yuanzhi Li, et al. Convergence Analysis of Two-layer Neural Networks with ReLU Activation, 2017, NIPS.
[44] R. Samworth, et al. Generalized additive and index models with shape constraints, 2014, 1404.2957.
[45] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[46] Inderjit S. Dhillon, et al. Recovery Guarantees for One-hidden-layer Neural Networks, 2017, ICML.
[47] H. D. Brunk. Maximum Likelihood Estimates of Monotone Parameters, 1955.
[48] Vladimir Vapnik, et al. Principles of Risk Minimization for Learning Theory, 1991, NIPS.
[49] Renato D. C. Monteiro, et al. Interior path following primal-dual algorithms. Part II: Convex quadratic programming, 1989, Math. Program.
[50] David A. Freedman, et al. Statistical Models: Theory and Practice, 2005.
[51] Aditya Bhaskara, et al. Provable Bounds for Learning Some Deep Representations, 2013, ICML.
[52] Yuandong Tian, et al. An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis, 2017, ICML.