Ecient and Practical Stochastic Subgradient Descent for Nuclear

We describe novel subgradient methods for a broad class of matrix optimization problems involving nuclear norm regularization. Unlike existing approaches, our method executes very cheap iterations by combining low-rank stochastic subgradients with ecient incremental SVD updates, made possible by highly optimized and parallelizable dense linear algebra operations on small matrices. Our practical algorithms always maintain a low-rank factorization of iterates that can be conveniently held in memory and efciently multiplied to generate predictions in matrix completion settings. Empirical comparisons conrm that our approach is highly competitive with several recently proposed state-of-the-art solvers for such problems.

[1]  Kenneth L. Clarkson,et al.  Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm , 2008, SODA '08.

[2]  David F. Gleich,et al.  Tall and skinny QR factorizations in MapReduce architectures , 2011, MapReduce '11.

[3]  Balas K. Natarajan,et al.  Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..

[4]  J. Borwein,et al.  Techniques of variational analysis , 2005 .

[5]  Philip Wolfe,et al.  An algorithm for quadratic programming , 1956 .

[6]  Christopher Ré,et al.  Parallel stochastic gradient algorithms for large-scale matrix completion , 2013, Mathematical Programming Computation.

[7]  Jack Dongarra,et al.  ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers , 1992, [Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation.

[8]  Philipp Birken,et al.  Numerical Linear Algebra , 2011, Encyclopedia of Parallel Computing.

[9]  Andrew Cotter,et al.  Stochastic Optimization for Machine Learning , 2013, ArXiv.

[10]  Jieping Ye,et al.  An accelerated gradient method for trace norm minimization , 2009, ICML '09.

[11]  Ohad Shamir,et al.  Large-Scale Convex Minimization with a Low-Rank Constraint , 2011, ICML.

[12]  Martin Jaggi,et al.  A Simple Algorithm for Nuclear Norm Regularized Problems , 2010, ICML.

[13]  S. Mallat,et al.  Adaptive greedy approximations , 1997 .

[14]  Emmanuel J. Candès,et al.  A Singular Value Thresholding Algorithm for Matrix Completion , 2008, SIAM J. Optim..

[15]  Vikas Sindhwani,et al.  Efficient and Practical Stochastic Subgradient Descent for Nuclear Norm Regularization , 2012, ICML.

[16]  Mark Hoemmen,et al.  Communication-avoiding Krylov subspace methods , 2010 .

[17]  Robert Tibshirani,et al.  Spectral Regularization Algorithms for Learning Large Incomplete Matrices , 2010, J. Mach. Learn. Res..