Iterative regularization for convex regularizers

We study iterative regularization for linear models when the bias is convex but not necessarily strongly convex. We characterize the stability properties of a primal-dual gradient-based approach, analyzing its convergence in the presence of worst-case deterministic noise. As a main example, we specialize and illustrate the results for the problem of robust sparse recovery. Key to our analysis is a combination of ideas from regularization theory and optimization in the presence of errors. The theoretical results are complemented by experiments showing that state-of-the-art performance can be achieved with considerable computational speed-ups.
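To make the setting concrete, the sketch below implements a standard primal-dual (Chambolle-Pock-style) iteration for the basis pursuit formulation of sparse recovery, min ||x||_1 subject to Ax = b, where the iteration budget plays the role of the regularization parameter. This is a generic illustration of the algorithmic template under simple assumptions, not the paper's exact method; the function names, step-size choices, and the `n_iters` stopping budget are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (componentwise soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def primal_dual_sparse_recovery(A, b, n_iters=300, sigma=None, tau=None):
    """Chambolle-Pock iterations for:  min ||x||_1  s.t.  Ax = b.

    With noisy data b, early stopping (the choice of n_iters) acts as
    the regularization parameter instead of an explicit penalty weight.
    """
    m, n = A.shape
    L = np.linalg.norm(A, 2)      # spectral norm; convergence needs sigma*tau*L**2 <= 1
    if sigma is None:
        sigma = 1.0 / L
    if tau is None:
        tau = 1.0 / L
    x = np.zeros(n)               # primal variable
    x_bar = np.zeros(n)           # extrapolated primal variable
    y = np.zeros(m)               # dual variable for the constraint Ax = b
    for _ in range(n_iters):
        # Dual ascent step on the linear constraint Ax = b.
        y = y + sigma * (A @ x_bar - b)
        # Primal proximal step on the l1 norm.
        x_new = soft_threshold(x - tau * (A.T @ y), tau)
        # Over-relaxation with theta = 1.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x

if __name__ == "__main__":
    # Small synthetic example: recover a 5-sparse signal from noisy measurements.
    rng = np.random.default_rng(0)
    n, m, s = 200, 80, 5
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_true = np.zeros(n)
    x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
    b = A @ x_true + 0.01 * rng.standard_normal(m)
    x_hat = primal_dual_sparse_recovery(A, b, n_iters=300)
    print("recovery error:", np.linalg.norm(x_hat - x_true))
```

In this sketch the number of iterations trades off data fit against noise amplification: too few iterations underfit, while running to full convergence on noisy data overfits the noise, which is the early-stopping phenomenon the analysis quantifies.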
