Reconciliation of Statistical and Spatial Sparsity For Robust Image and Image-Set Classification

Recent image classification algorithms, by learning deep features from large-scale datasets, have achieved significantly better results than classic feature-based approaches. However, image classification still faces various challenges in practice, such as classifying noisy image or image-set queries, and training deep image classification models on limited-scale datasets. Instead of applying generic deep features, model-based approaches can be more effective and data-efficient for robust image and image-set classification tasks, as they exploit various image priors to model the inter- and intra-set data variations while preventing over-fitting. In this work, we propose a novel Joint Statistical and Spatial Sparse representation, dubbed J3S, to model image or image-set data for classification, by reconciling both their local patch structures and their global Gaussian distribution mapped into a Riemannian manifold. To the best of our knowledge, no work to date has utilized both global statistics and local patch structures jointly via joint sparse representation. We propose to solve the joint sparse coding problem based on the J3S model, by coupling the local and global image representations using joint sparsity. The learned J3S models are used for robust image and image-set classification. Experiments show that the proposed J3S-based image classification scheme outperforms popular or state-of-the-art competing methods over the FMD, UIUC, ETH-80 and YTC databases.
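As an illustration of what mapping a global Gaussian distribution into a Riemannian manifold can look like, the following is a minimal sketch and not the J3S method itself. The abstract does not specify the embedding, so the sketch assumes one common choice: the mean and covariance of a set of feature vectors are packed into a symmetric positive definite (SPD) matrix, which is then flattened with the matrix logarithm (a log-Euclidean mapping). The function names `gaussian_to_spd` and `log_euclidean_vector`, the regularizer `eps`, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def gaussian_to_spd(features, eps=1e-6):
    """Embed the Gaussian fitted to `features` (n_samples x d) as a
    (d+1) x (d+1) SPD matrix [[Sigma + mu mu^T, mu], [mu^T, 1]].

    Assumed embedding for illustration; `eps` is a hypothetical regularizer.
    """
    mu = features.mean(axis=0)
    centered = features - mu
    sigma = centered.T @ centered / max(len(features) - 1, 1)
    d = mu.shape[0]
    sigma = sigma + eps * np.eye(d)  # keep the covariance strictly positive definite

    spd = np.empty((d + 1, d + 1))
    spd[:d, :d] = sigma + np.outer(mu, mu)
    spd[:d, d] = mu
    spd[d, :d] = mu
    spd[d, d] = 1.0
    return spd


def log_euclidean_vector(spd):
    """Map an SPD matrix to a flat vector via the matrix logarithm.

    Off-diagonal entries are scaled by sqrt(2) so that the Euclidean norm of
    the vector equals the Frobenius norm of log(spd).
    """
    eigvals, eigvecs = np.linalg.eigh(spd)
    log_spd = (eigvecs * np.log(eigvals)) @ eigvecs.T
    upper = np.triu_indices(spd.shape[0], k=1)
    return np.concatenate([np.diag(log_spd), np.sqrt(2.0) * log_spd[upper]])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch_features = rng.normal(size=(200, 8))  # stand-in for per-image features
    vector = log_euclidean_vector(gaussian_to_spd(patch_features))
    print(vector.shape)
```

Under these assumptions, each image or image set yields one fixed-length vector on which ordinary Euclidean sparse coding could operate alongside a local patch representation.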
