Variable projection without smoothness

Variable projection is a powerful technique in optimization. Over the last 30 years, it has been applied broadly, with empirical and theoretical results demonstrating both greater efficacy and greater stability than competing approaches. In this paper, we illustrate the technique on a large class of structured nonsmooth optimization problems, with numerical examples in sparse deconvolution and machine learning applications.
