Reexamining Low Rank Matrix Factorization for Trace Norm Regularization

Trace norm regularization is a widely used approach for learning low-rank matrices. A standard optimization strategy reformulates the problem as one of low-rank matrix factorization, which, however, leads to a non-convex problem. In practice this approach works well and is often computationally faster than standard convex solvers such as proximal gradient methods. Nevertheless, it is not guaranteed to converge to a global optimum, and the optimization can become trapped at poor stationary points. In this paper we show that it is possible to characterize all critical points of the non-convex problem. This allows us to provide an efficient criterion to determine whether a critical point is also a global minimizer. Our analysis suggests an iterative meta-algorithm that dynamically expands the parameter space and allows the optimization to escape any non-global critical point, thereby converging to a global minimizer. The algorithm can be applied to problems such as matrix completion and multitask learning, and our analysis holds for any random initialization of the factor matrices. Finally, we confirm the good performance of the algorithm on synthetic and real datasets.
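For concreteness, the factorization strategy the abstract refers to rests on the standard variational characterization of the trace norm (the symbols f, \lambda, and the rank r below are our notation, not necessarily the paper's):

\[
\|X\|_* \;=\; \min_{U,V \,:\, X = UV^\top} \tfrac{1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right),
\]

so that the convex problem \(\min_X f(X) + \lambda \|X\|_*\) is replaced, for factors of rank r, by the non-convex surrogate

\[
\min_{U \in \mathbb{R}^{n \times r},\; V \in \mathbb{R}^{m \times r}} \; f(UV^\top) + \frac{\lambda}{2}\left(\|U\|_F^2 + \|V\|_F^2\right).
\]

In analyses of this kind, the global-optimality test at a critical point (U, V) is typically the spectral-norm condition \(\|\nabla f(UV^\top)\|_2 \le \lambda\); we state it here as an illustrative assumption about the form of the criterion, not as the paper's exact result. Under that assumption, a minimal Python sketch of a rank-incremental meta-algorithm of the kind the abstract describes is given below; grad_f, the local solver, and all step sizes are hypothetical placeholders:

import numpy as np

def local_solve(grad_f, lam, U, V, steps=500, lr=1e-2):
    # Plain gradient descent on the factorized objective
    # f(U V^T) + (lam / 2) * (||U||_F^2 + ||V||_F^2).
    for _ in range(steps):
        G = grad_f(U @ V.T)
        U, V = U - lr * (G @ V + lam * U), V - lr * (G.T @ U + lam * V)
    return U, V

def meta_algorithm(grad_f, lam, n, m, r0=1, tol=1e-6, seed=0):
    # Start from a random rank-r0 factorization; the abstract states the
    # analysis holds for any random initialization of the factors.
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n, r0))
    V = 0.1 * rng.standard_normal((m, r0))
    while True:
        U, V = local_solve(grad_f, lam, U, V)
        # Optimality test at the critical point (illustrative assumption):
        # (U, V) is a global minimizer iff ||grad f(U V^T)||_2 <= lam.
        Gu, s, Gvt = np.linalg.svd(grad_f(U @ V.T))
        if s[0] <= lam + tol or U.shape[1] >= min(n, m):
            return U, V
        # Escape: dynamically expand the parameter space by one column,
        # seeded with the top singular pair of -grad f. This perturbs
        # X = U V^T by -alpha * u1 v1^T, decreasing the objective by about
        # alpha * (s[0] - lam) > 0, so it is a genuine descent direction.
        alpha = 1e-2
        U = np.hstack([U, -np.sqrt(alpha) * Gu[:, :1]])
        V = np.hstack([V, np.sqrt(alpha) * Gvt[:1, :].T])

As a usage example, for matrix completion with observed entries marked by a 0/1 mask and data matrix M, one could pass grad_f = lambda X: (X - M) * mask, which is the gradient of the squared loss on the observed entries.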
