A Tensor-Train Dictionary Learning algorithm based on Spectral Proximal Alternating Linearized Minimization

Dictionary Learning (DL) is one of the leading sparsity-promoting techniques for image classification: a “dictionary” matrix D of images and a sparse matrix X are determined so as to represent a redundant image dataset. The resulting constrained optimization problem is nonconvex and nonsmooth, which poses several computational challenges. To preserve multidimensional data features, various tensor DL formulations have been introduced, which further increase the complexity of the problem. We propose a new tensor formulation of the DL problem based on a Tensor-Train decomposition of the multidimensional dictionary, together with a new alternating algorithm for its solution. The new method belongs to the Proximal Alternating Linearized Minimization (PALM) algorithmic family and incorporates second-order information to enhance efficiency. We present a rigorous convergence analysis and report on the performance of the new method in the image classification of several benchmark datasets.
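To make the PALM template concrete, the following is a minimal sketch of plain PALM applied to a matrix DL model. It is not the Tensor-Train formulation or the spectral variant proposed here. It assumes the objective 0.5 * ||Y - D X||_F^2 + lam * ||X||_1 with the columns of D constrained to the unit ball. The l1 penalty, the names `palm_dl`, `soft_threshold`, `project_columns` and `objective`, and the parameters `lam` and `gamma` are illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_threshold(Z, tau):
    # Proximal operator of tau * ||.||_1 (entrywise shrinkage).
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def project_columns(D):
    # Projection onto {D : each column has Euclidean norm <= 1}.
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms

def objective(Y, D, X, lam):
    return 0.5 * np.linalg.norm(Y - D @ X, "fro") ** 2 + lam * np.abs(X).sum()

def palm_dl(Y, k, lam=0.1, iters=50, gamma=1.1, seed=0):
    """Plain PALM sketch (assumed l1-penalized matrix model, no spectral steps)."""
    rng = np.random.default_rng(seed)
    D = project_columns(rng.standard_normal((Y.shape[0], k)))
    X = np.zeros((k, Y.shape[1]))
    history = [objective(Y, D, X, lam)]
    for _ in range(iters):
        # X-block: linearize the smooth term at X, then apply the l1 prox.
        # The partial gradient in X is Lipschitz with constant ||D||_2^2.
        L_X = gamma * max(np.linalg.norm(D, 2) ** 2, 1e-12)
        X = soft_threshold(X - D.T @ (D @ X - Y) / L_X, lam / L_X)
        # D-block: linearize at D, then project onto the constraint set.
        # The partial gradient in D is Lipschitz with constant ||X||_2^2.
        L_D = gamma * max(np.linalg.norm(X, 2) ** 2, 1e-12)
        D = project_columns(D - (D @ X - Y) @ X.T / L_D)
        history.append(objective(Y, D, X, lam))
    return D, X, history
```

With `gamma > 1`, each block update is a proximal gradient step with steplength below the reciprocal of the block Lipschitz constant, so the objective values stored in `history` should be non-increasing. A spectral variant would presumably replace these fixed steplengths with ones built from successive iterates and gradients, but the abstract does not specify those details.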
