Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods

We consider the problem of estimating the values of a function over $n$ nodes of a $d$-dimensional grid graph (having equal side lengths $n^{1/d}$) from noisy observations. The function is assumed to be smooth, but is allowed to exhibit different amounts of smoothness in different regions of the grid. Such heterogeneity eludes classical measures of smoothness from nonparametric statistics, such as Hölder smoothness. Meanwhile, total variation (TV) smoothness classes allow for heterogeneity, but are restrictive in another sense: only constant functions count as perfectly smooth (achieve zero TV). To move past this, we define two new higher-order TV classes, based on two ways of compiling the discrete derivatives of a parameter across the nodes. We relate these two new classes to Hölder classes, and derive lower bounds on their minimax errors. We also analyze two naturally associated trend filtering methods; when $d=2$, each is seen to be rate optimal over the appropriate class.
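The limitation of first-order TV noted above can be checked numerically. The following is a minimal sketch, not the paper's construction: the helper names `diff` and `grid_tv` are hypothetical, and summing absolute differences along rows and columns is just one assumed way of compiling discrete derivatives on a 2-d grid. It shows that a constant signal has zero first-order TV, that a linear signal does not, and that the linear signal's second-order differences vanish.

```python
def diff(values, order):
    """Apply the forward difference operator `order` times to a 1-d list."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def grid_tv(theta, order):
    """Sum of absolute order-th differences along every row and every column.

    Assumption of this sketch: derivatives are aggregated axis by axis.
    This is illustrative only and is not taken from the paper.
    """
    rows = [list(row) for row in theta]
    cols = [list(col) for col in zip(*theta)]
    return sum(abs(d) for line in rows + cols for d in diff(line, order))


side = 5
# A constant signal and a linear signal theta[i][j] = 2*i - j on a 5 x 5 grid.
constant = [[3.0] * side for _ in range(side)]
linear = [[2.0 * i - 1.0 * j for j in range(side)] for i in range(side)]

tv1_constant = grid_tv(constant, 1)  # zero: constants are "perfectly smooth"
tv1_linear = grid_tv(linear, 1)      # positive: first-order TV penalizes slope
tv2_linear = grid_tv(linear, 2)      # zero: second differences of a line vanish

print(tv1_constant, tv1_linear, tv2_linear)
```

Under this assumed aggregation, a second-order penalty treats linear trends as perfectly smooth, which is the kind of relaxation that the higher-order classes are meant to provide.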
