Sparsity-Driven Laplacian-Regularized Outlier Identification for Dictionary Learning

Anomalies in data have traditionally been treated as nuisances that, if ignored, can degrade the output of many data processing tasks. In many situations, however, anomalies correspond to events of interest and should be identified promptly, before they are masked by the preprocessing schemes used to reduce the complexity of the main data processing task. This work develops a robust dictionary learning algorithm that exploits sparsity and the local geometry of the data to identify anomalies while constructing sparse representations of the data. Sparsity is used to model the presence of anomalies in a dataset, and local geometry is exploited to better qualify a datum as an anomaly. The robust dictionary learning problem is cast as a regularized least-squares problem with sparsity-inducing and Laplacian regularization terms. Efficient iterative solvers based on block-coordinate descent and proximal gradient methods are developed to tackle the resulting joint dictionary learning and anomaly detection problems. The proposed framework is extended to address variations of classical dictionary learning and matrix factorization problems. Numerical tests on real datasets with artificial and real anomalies illustrate the performance of the proposed algorithms.
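
To make the roles of the two regularizers concrete, the following is a minimal sketch of a proximal gradient update for an outlier matrix, under stated assumptions. The abstract does not give the objective, so the form used here is an assumption: with the dictionary and sparse codes held fixed (as in one block of a block-coordinate descent scheme), the residual R is fit by an outlier matrix O through 0.5*||R - O||_F^2 + (mu/2)*tr(O L O^T) + lam * sum_n ||o_n||_2, where L is a graph Laplacian over the data points. The names `group_soft_threshold`, `outlier_prox_grad`, `lam`, and `mu` are hypothetical and chosen for illustration only.

```python
import numpy as np


def group_soft_threshold(V, tau):
    """Shrink each column of V toward zero by tau in Euclidean norm."""
    norms = np.linalg.norm(V, axis=0, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return V * scale


def outlier_prox_grad(R, L, lam, mu, n_iter=200):
    """Proximal gradient on the ASSUMED objective
    0.5*||R - O||_F^2 + (mu/2)*tr(O L O^T) + lam * sum_n ||o_n||_2,
    where R is the residual (data minus dictionary times codes)."""
    O = np.zeros_like(R)
    # Step size 1/Lipschitz, with Lipschitz constant 1 + mu * lambda_max(L).
    step = 1.0 / (1.0 + mu * np.linalg.eigvalsh(L)[-1])
    for _ in range(n_iter):
        grad = (O - R) + mu * (O @ L)   # gradient of the smooth terms
        O = group_soft_threshold(O - step * grad, step * lam)
    return O


# Toy residual: five data points, only the third has a large residual.
R = np.full((3, 5), 0.1)
R[:, 2] = [3.0, 4.0, 0.0]

# Laplacian of a path graph linking consecutive points (assumed geometry).
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(A.sum(axis=1)) - A

O = outlier_prox_grad(R, L, lam=1.0, mu=0.1)
flagged = np.flatnonzero(np.linalg.norm(O, axis=0) > 0).tolist()
print("columns flagged as outliers:", flagged)
```

In this sketch a datum is declared an anomaly when its column of O is nonzero; the column-wise shrinkage supplies the sparsity, and the Laplacian term couples the outlier estimates of neighboring points.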
