Learning to solve TV regularised problems with unrolled algorithms

Total Variation (TV) is a popular regularization strategy that promotes piecewise constant signals by penalizing the ℓ1-norm of the first-order derivative of the estimated signal. The resulting optimization problem is usually solved using iterative algorithms such as proximal gradient descent, primal-dual algorithms or ADMM. However, such methods can require a very large number of iterations to converge to a suitable solution. In this paper, we accelerate such iterative algorithms by unfolding proximal gradient descent solvers in order to learn their parameters for 1D TV-regularized problems. While this could be done using the synthesis formulation, we demonstrate that this leads to slower performance. The main difficulty in applying such methods in the analysis formulation lies in proposing a way to compute the derivatives through the proximal operator. As our main contribution, we develop and characterize two approaches to do so, describe their benefits and limitations, and discuss the regime where they can actually improve over iterative procedures. We validate these findings with experiments on synthetic and real data.
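
To make the setting concrete, the following is a minimal sketch of proximal gradient descent for a 1D TV-regularized least-squares problem in the analysis formulation, written so that each iteration reads as one layer of an unrolled network. It is not the paper's implementation. The names `prox_tv_1d` and `unrolled_pgd_tv`, the choice of per-layer step sizes as the tunable parameters, and the inner dual projected-gradient loop used to approximate the TV proximal operator are all assumptions made for illustration. The sketch contains no training loop and does not differentiate through the proximal operator.

```python
import numpy as np


def prox_tv_1d(y, mu, n_inner=200):
    """Approximate argmin_u 0.5 * ||u - y||^2 + mu * ||D u||_1 in 1D.

    D is the first-order finite difference (D u = np.diff(u)). The prox is
    approximated by projected gradient steps on the dual problem (an
    illustrative choice, not necessarily the solver used in the paper).
    """
    v = np.zeros(len(y) - 1)  # dual variable, one entry per difference
    tau = 0.25                # valid step since ||D D^T|| < 4 in 1D
    for _ in range(n_inner):
        # Primal estimate u = y - D^T v, with -D^T v written via np.diff.
        u = y + np.diff(v, prepend=0.0, append=0.0)
        # Dual gradient step followed by projection onto the box [-mu, mu].
        v = np.clip(v + tau * np.diff(u), -mu, mu)
    return y + np.diff(v, prepend=0.0, append=0.0)


def unrolled_pgd_tv(x, A, lam, step_sizes):
    """Run len(step_sizes) proximal gradient layers on
    0.5 * ||x - A u||^2 + lam * ||D u||_1.

    `step_sizes` holds one step size per layer (hypothetical parameterization);
    in a learned variant these would be the trained parameters.
    """
    u = np.zeros(A.shape[1])
    for rho in step_sizes:
        grad = A.T @ (A @ u - x)                   # gradient of the data-fit term
        u = prox_tv_1d(u - rho * grad, rho * lam)  # proximal (TV) step
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 6))
    x = rng.standard_normal(8)
    lipschitz = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    u_hat = unrolled_pgd_tv(x, A, lam=0.1, step_sizes=[1.0 / lipschitz] * 10)
    print(u_hat)
```

With every step size fixed to the inverse Lipschitz constant, this reduces to plain proximal gradient descent. Making `step_sizes` trainable requires gradients through `prox_tv_1d`, which is the difficulty the abstract identifies for the analysis formulation.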
