Stochastic Recursive Gradient Support Pursuit and Its Sparse Representation Applications

In recent years, a series of matching pursuit and hard thresholding algorithms have been proposed to solve the sparse representation problem with an ℓ0-norm constraint. Several stochastic hard thresholding methods have also been proposed, such as stochastic gradient hard thresholding (SG-HT) and stochastic variance reduced gradient hard thresholding (SVRGHT). However, every iteration of these algorithms requires one hard thresholding operation, which leads to high per-iteration complexity and slow convergence, especially for high-dimensional problems. To address this issue, we propose a new stochastic recursive gradient support pursuit (SRGSP) algorithm, which requires only one hard thresholding operation per outer iteration. As a result, SRGSP has a significantly lower computational complexity than existing methods such as SG-HT and SVRGHT. We also provide a convergence analysis showing that SRGSP attains a linear convergence rate. Experimental results on large-scale synthetic and real-world datasets verify that SRGSP outperforms state-of-the-art related methods on various sparse representation problems. In addition, we conduct extensive experiments on two real-world sparse representation applications, image denoising and face recognition, and the results show that SRGSP performs much better than other sparse representation optimization methods in terms of PSNR and recognition rate, respectively.
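To make the two ingredients named above concrete, the following is a minimal Python sketch, not the authors' SRGSP implementation. It assumes a least-squares loss, a SARAH-style recursive gradient estimator in the inner loop, and a single hard thresholding step at the end of each outer iteration. The function names `hard_threshold` and `recursive_gradient_ht`, the step-size rule, and the synthetic data are all hypothetical choices made for illustration; the abstract does not specify the support pursuit details, so they are omitted here.

```python
import numpy as np


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    if k <= 0:
        return out
    k = min(k, x.size)
    # Indices of the k largest absolute values (order among them is irrelevant).
    keep = np.argpartition(np.abs(x), -k)[-k:]
    out[keep] = x[keep]
    return out


def recursive_gradient_ht(A, y, k, outer_iters=30, inner_iters=None, step=None, seed=0):
    """Illustrative loop for min (1/2n)||Ax - y||^2 subject to ||x||_0 <= k.

    Assumption: this is a simplified sketch, not the SRGSP algorithm itself.
    Hard thresholding is applied once per outer iteration, never inside
    the inner loop.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner_iters = n if inner_iters is None else inner_iters
    if step is None:
        # Conservative choice: half the inverse of the largest per-sample
        # smoothness constant ||a_i||^2 (an assumption of this sketch).
        step = 0.5 / np.max(np.sum(A * A, axis=1))
    x = np.zeros(d)
    for _ in range(outer_iters):
        w_prev = x.copy()
        v = A.T @ (A @ w_prev - y) / n      # full gradient at the snapshot
        w = w_prev - step * v
        for _ in range(inner_iters):
            a = A[rng.integers(n)]
            # Recursive estimator: v_t = v_{t-1} + grad_i(w_t) - grad_i(w_{t-1}).
            v = v + a * (a @ (w - w_prev))
            w_prev = w
            w = w - step * v
        x = hard_threshold(w, k)            # the only thresholding per outer pass
    return x


if __name__ == "__main__":
    # Synthetic sparse recovery problem (hypothetical sizes).
    rng = np.random.default_rng(1)
    n, d, k = 200, 50, 5
    A = rng.standard_normal((n, d))
    x_true = np.zeros(d)
    x_true[:k] = 2.0
    y = A @ x_true
    x_hat = recursive_gradient_ht(A, y, k)
    print("nonzeros:", np.count_nonzero(x_hat))
    print("residual:", np.linalg.norm(A @ x_hat - y))
```

Because the inner loop updates only dense vectors, the top-k selection is paid once per outer pass rather than once per stochastic step, which is the source of the per-iteration savings described above.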
