暂无分享,去创建一个
[1] Liwei Wang,et al. Gradient Descent Finds Global Minima of Deep Neural Networks , 2018, ICML.
[2] David P. Woodru. Sketching as a Tool for Numerical Linear Algebra , 2014 .
[3] Anima Anandkumar,et al. Provable Methods for Training Neural Networks with Sparse Connectivity , 2014, ICLR.
[4] Raman Arora,et al. Understanding Deep Neural Networks with Rectified Linear Units , 2016, Electron. Colloquium Comput. Complex..
[5] Varun Kanade,et al. Reliably Learning the ReLU in Polynomial Time , 2016, COLT.
[6] Yin Tat Lee,et al. A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization , 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.
[7] Roman Vershynin,et al. High-Dimensional Probability , 2018 .
[8] Simon S. Du,et al. Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps , 2018, ArXiv.
[9] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] David Gross,et al. Recovering Low-Rank Matrices From Few Coefficients in Any Basis , 2009, IEEE Transactions on Information Theory.
[11] Rocco A. Servedio,et al. Agnostically learning halfspaces , 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).
[12] Mahdi Soltanolkotabi,et al. Learning ReLUs via Gradient Descent , 2017, NIPS.
[13] Tamás Sarlós,et al. Improved Approximation Algorithms for Large Matrices via Random Projections , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[14] Erkki Oja,et al. Independent component analysis: algorithms and applications , 2000, Neural Networks.
[15] Huan Wang,et al. Exact Recovery of Sparsely-Used Dictionaries , 2012, COLT.
[16] Misha Denil,et al. Predicting Parameters in Deep Learning , 2014 .
[17] Santosh S. Vempala,et al. Fourier PCA and robust tensor decomposition , 2013, STOC.
[18] Yuchen Zhang,et al. L1-regularized Neural Networks are Improperly Learnable in Polynomial Time , 2015, ICML.
[19] Raghu Meka,et al. Learning One Convolutional Layer with Overlapping Patches , 2018, ICML.
[20] Tengyu Ma,et al. Learning One-hidden-layer Neural Networks with Landscape Design , 2017, ICLR.
[21] Anima Anandkumar,et al. Score Function Features for Discriminative Learning: Matrix and Tensor Framework , 2014, ArXiv.
[22] Andrea Montanari,et al. On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition , 2018, AISTATS.
[23] Zhize Li,et al. Learning Two-layer Neural Networks with Symmetric Inputs , 2018, ICLR.
[24] Anima Anandkumar,et al. Provable Tensor Methods for Learning Mixtures of Generalized Linear Models , 2014, AISTATS.
[25] A. Dasgupta. Asymptotic Theory of Statistics and Probability , 2008 .
[26] Ankur Moitra,et al. Noisy tensor completion via the sum-of-squares hierarchy , 2015, Mathematical Programming.
[27] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.
[28] P. Massart,et al. Adaptive estimation of a quadratic functional by model selection , 2000 .
[29] Anima Anandkumar,et al. Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..
[30] Tengyu Ma,et al. Decomposing Overcomplete 3rd Order Tensors using Sum-of-Squares Algorithms , 2015, APPROX-RANDOM.
[31] Yuanzhi Li,et al. Convergence Analysis of Two-layer Neural Networks with ReLU Activation , 2017, NIPS.
[32] Anima Anandkumar,et al. Online and Differentially-Private Tensor Decomposition , 2016, NIPS.
[33] Michael W. Mahoney. Randomized Algorithms for Matrices and Data , 2011, Found. Trends Mach. Learn..
[34] Adam R. Klivans,et al. Learning Neural Networks with Two Nonlinear Layers in Polynomial Time , 2017, COLT.
[35] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[36] Pierre Comon,et al. Independent component analysis, A new concept? , 1994, Signal Process..
[37] Joan Bruna,et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.
[38] Sanjeev Arora,et al. Provable learning of noisy-OR networks , 2016, STOC.
[39] Santosh S. Vempala,et al. Learning Convex Concepts from Gaussian Distributions with PCA , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.
[40] J. Stephen Judd,et al. Neural network design and the complexity of learning , 1990, Neural network modeling and connectionism.
[41] Christopher J. Hillar,et al. Most Tensor Problems Are NP-Hard , 2009, JACM.
[42] Ronald L. Rivest,et al. Training a 3-node neural network is NP-complete , 1988, COLT '88.
[43] W. Bryc. The Normal Distribution: Characterizations with Applications , 1995 .
[44] Sanjeev Arora,et al. Provable ICA with Unknown Gaussian Noise, and Implications for Gaussian Mixtures and Autoencoders , 2012, Algorithmica.
[45] Sham M. Kakade,et al. Learning mixtures of spherical gaussians: moment methods and spectral decompositions , 2012, ITCS '13.
[46] Aapo Hyvärinen,et al. Fast and robust fixed-point algorithms for independent component analysis , 1999, IEEE Trans. Neural Networks.
[47] Guanghui Lan,et al. Complexity of Training ReLU Neural Network , 2018, Discret. Optim..
[48] Andrea Montanari,et al. Matrix completion from a few entries , 2009, 2009 IEEE International Symposium on Information Theory.
[49] Yuanzhi Li,et al. A Convergence Theory for Deep Learning via Over-Parameterization , 2018, ICML.
[50] E. Candès,et al. Sparsity and incoherence in compressive sampling , 2006, math/0611957.
[51] Aleksander Madry,et al. Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).
[52] Inderjit S. Dhillon,et al. Recovery Guarantees for One-hidden-layer Neural Networks , 2017, ICML.
[53] Adam R. Klivans,et al. Learning Depth-Three Neural Networks in Polynomial Time , 2017, ArXiv.
[54] Pasin Manurangsi,et al. The Computational Complexity of Training ReLU(s) , 2018, ArXiv.
[55] Yaniv Plan,et al. Robust 1-bit Compressed Sensing and Sparse Logistic Regression: A Convex Programming Approach , 2012, IEEE Transactions on Information Theory.
[56] Anima Anandkumar,et al. Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation , 2012, NIPS 2012.
[57] Aditya Bhaskara,et al. Provable Bounds for Learning Some Deep Representations , 2013, ICML.
[58] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra , 2014, Found. Trends Theor. Comput. Sci..
[59] Yuandong Tian,et al. Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity , 2017, ICLR.
[60] Yuandong Tian,et al. An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis , 2017, ICML.
[61] David P. Woodruff,et al. Sublinear Time Orthogonal Tensor Decomposition , 2016, NIPS.
[62] David Steurer,et al. Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method , 2014, STOC.
[63] Emmanuel J. Candès,et al. Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..
[64] Martin J. Wainwright,et al. High-Dimensional Statistics , 2019 .
[65] Adam Tauman Kalai,et al. Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression , 2011, NIPS.
[66] L. Meng,et al. The optimal perturbation bounds of the Moore–Penrose inverse under the Frobenius norm , 2010 .
[67] Moritz Hardt,et al. Understanding Alternating Minimization for Matrix Completion , 2013, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.
[68] Alan M. Frieze,et al. Fast monte-carlo algorithms for finding low-rank approximations , 2004, JACM.
[69] Yi Ma,et al. Robust principal component analysis? , 2009, JACM.
[70] Daniel Soudry,et al. No bad local minima: Data independent training error guarantees for multilayer neural networks , 2016, ArXiv.
[71] Piotr Indyk,et al. Stable distributions, pseudorandom generators, embeddings, and data stream computation , 2006, JACM.
[72] Prateek Jain,et al. Low-rank matrix completion using alternating minimization , 2012, STOC '13.
[73] M. Rudelson,et al. Non-asymptotic theory of random matrices: extreme singular values , 2010, 1003.2990.
[74] Amir Globerson,et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs , 2017, ICML.
[75] Aditya Bhaskara,et al. Smoothed analysis of tensor decompositions , 2013, STOC.
[76] Alan M. Frieze,et al. Learning linear transformations , 1996, Proceedings of 37th Conference on Foundations of Computer Science.
[77] Anima Anandkumar,et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods , 2017 .
[78] Guanghui Lan,et al. Complexity of Training ReLU Neural Networks , 2018 .
[79] Roman Vershynin,et al. Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.
[80] Nimrod Megiddo,et al. On the complexity of polyhedral separability , 1988, Discret. Comput. Geom..