The Global Optimization Geometry of Shallow Linear Neural Networks
Yonina C. Eldar | Zhihui Zhu | Michael B. Wakin | Daniel Soudry