Vector-Valued Graph Trend Filtering With Non-Convex Penalties

This article studies the denoising of piecewise smooth graph signals that exhibit inhomogeneous levels of smoothness over a graph, where the value at each node can be vector-valued. We extend the graph trend filtering framework to the denoising of vector-valued graph signals using a family of non-convex regularizers, which exhibit superior recovery performance over existing convex regularizers. Using an oracle inequality, we establish the statistical error rates of first-order stationary points of the proposed non-convex method for generic graphs. Furthermore, we present an ADMM-based algorithm to solve the resulting optimization problem and establish its convergence. Numerical experiments are conducted on both synthetic and real-world data for denoising, support recovery, event detection, and semi-supervised classification.
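
To illustrate why a non-convex regularizer can recover sharp changes better than a convex one, the following minimal sketch compares two thresholding rules applied to the edge differences of a scalar signal on a path graph. It is not the article's algorithm: the abstract does not name a specific penalty here, so the minimax concave penalty (MCP) is used as one well-known example of a non-convex regularizer, and the function names, the path graph, and the parameter values `lam` and `gamma` are all assumptions chosen for illustration.

```python
def path_graph_differences(signal):
    """First-order differences on a path graph: one value per edge (i, i+1)."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def mcp_penalty(x, lam, gamma):
    """Minimax concave penalty (MCP) of a scalar x, with lam > 0 and gamma > 1.

    MCP is an assumed example of a non-convex regularizer.
    """
    a = abs(x)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # penalty is flat for large |x|


def mcp_prox(z, lam, gamma):
    """Proximal operator of MCP with unit step size (firm thresholding)."""
    a = abs(z)
    if a <= lam:
        return 0.0
    if a <= gamma * lam:
        sign = 1.0 if z > 0 else -1.0
        return sign * (a - lam) / (1.0 - 1.0 / gamma)
    return z  # large differences pass through without shrinkage


def l1_prox(z, lam):
    """Soft thresholding, the proximal operator of the convex l1 penalty."""
    a = abs(z)
    if a <= lam:
        return 0.0
    return (a - lam) * (1.0 if z > 0 else -1.0)


# Hypothetical piecewise-constant signal on a 6-node path graph with one jump.
signal = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
diffs = path_graph_differences(signal)

lam, gamma = 0.5, 3.0  # illustrative values, not taken from the article
mcp_values = [mcp_prox(d, lam, gamma) for d in diffs]
l1_values = [l1_prox(d, lam) for d in diffs]

print("edge differences:", diffs)
print("after MCP prox:  ", mcp_values)
print("after l1 prox:   ", l1_values)
```

In this toy setting both rules zero out small differences, but soft thresholding also shrinks the large jump by `lam`, whereas the MCP rule leaves it untouched once it exceeds `gamma * lam`. A thresholding step of this kind is the sort of subproblem that often appears inside ADMM-type solvers; for vector-valued signals one would presumably apply the penalty to the norm of each edge difference rather than to a scalar, which this sketch does not cover.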
