Prediction bounds for higher order total variation regularized least squares

We establish adaptive results for trend filtering: least squares estimation with a penalty on the total variation of $(k-1)^{\text{th}}$ order differences. Our approach is based on combining a general oracle inequality for the $\ell_1$-penalized least squares estimator with "interpolating vectors" to upper bound the "effective sparsity". This allows one to show that the $\ell_1$-penalty on the $k^{\text{th}}$ order differences leads to an estimator that can adapt to the number of jumps in the $(k-1)^{\text{th}}$ order differences of the underlying signal or an approximation thereof. We show the result for $k \in \{1,2,3,4\}$ and indicate how it could be derived for general $k \in \mathbb{N}$.
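For concreteness, a minimal sketch of the estimator under discussion, assuming the standard trend filtering formulation with a tuning parameter $\lambda > 0$ (the exact scaling of the penalty is an assumption of this sketch, not a claim about the paper's notation): given observations $Y \in \mathbb{R}^n$,
$$
\hat{f} = \arg\min_{f \in \mathbb{R}^n} \left\{ \frac{1}{n} \|Y - f\|_2^2 + 2\lambda \|D^{(k)} f\|_1 \right\},
$$
where $D^{(k)} \in \mathbb{R}^{(n-k) \times n}$ is the $k^{\text{th}}$ order difference operator. Since $\|D^{(k)} f\|_1$ equals the total variation of the $(k-1)^{\text{th}}$ order differences of $f$, the case $k=1$ recovers the total variation penalty $\sum_{i=2}^{n} |f_i - f_{i-1}|$ of the fused lasso, while $k=2$ penalizes the kinks of a piecewise linear fit.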
