Implicit Regularization in Matrix Factorization

We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix $X$ with gradient descent on a factorization of $X$. We conjecture, and provide empirical and theoretical evidence, that with small enough step sizes and initialization close enough to the origin, gradient descent on a full-dimensional factorization converges to the minimum nuclear norm solution.
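As a minimal illustration of this setup (a sketch, not the paper's exact experiment), the NumPy script below runs gradient descent on a full-dimensional factorization $X = UU^\top$ of an underdetermined matrix sensing objective, starting near the origin with a small step size. It then compares the nuclear norm of the solution found against the minimum Frobenius norm solution of the same linear system and against the planted low-rank matrix. All dimensions, scalings, and hyperparameters are illustrative choices and may need tuning.

```python
# Sketch of the conjectured implicit regularization: gradient descent on a
# full-dimensional factorization X = U U^T, tiny initialization, small step
# size. Problem sizes and hyperparameters below are illustrative, not the
# paper's exact experimental setup.
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 20, 100, 2                # matrix size, #measurements, planted rank

# Planted low-rank PSD ground truth and random symmetric sensing matrices.
V = rng.standard_normal((n, r)) / np.sqrt(n)
X_star = V @ V.T
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2          # symmetrize each A_i
y = np.einsum('kij,ij->k', A, X_star)       # y_i = <A_i, X*>

def residual(X):
    return np.einsum('kij,ij->k', A, X) - y

# Gradient descent on f(U) = 1/(2m) * sum_k (<A_k, U U^T> - y_k)^2 with a
# full-dimensional factor U, initialized close to the origin.
U = 1e-3 * rng.standard_normal((n, n))
eta = 3e-3
for _ in range(30000):
    grad_X = np.einsum('k,kij->ij', residual(U @ U.T), A) / m   # df/dX
    U -= eta * 2 * grad_X @ U            # chain rule: df/dU = 2 (df/dX) U

X_gd = U @ U.T

# Baseline: minimum-Frobenius-norm solution of the underdetermined system.
x_min = np.linalg.lstsq(A.reshape(m, -1), y, rcond=None)[0]
X_frob = x_min.reshape(n, n)

nuc = lambda X: np.abs(np.linalg.eigvalsh((X + X.T) / 2)).sum()
print("nuclear norm, GD on factorization:", nuc(X_gd))
print("nuclear norm, min Frobenius norm :", nuc(X_frob))
print("nuclear norm, planted X*         :", nuc(X_star))
print("residual of GD solution          :", np.linalg.norm(residual(X_gd)))
```

If the conjecture holds in this regime, the factored gradient descent solution should show a markedly lower nuclear norm than the minimum Frobenius norm baseline, close to that of the planted low-rank matrix; with larger initialization or step size the effect weakens.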
