Auto-encoder based structured dictionary learning for visual classification

Abstract. Dictionary learning and deep learning can be combined to improve classification performance. However, existing combined methods typically learn multi-level dictionaries, one embedded in each network layer. They therefore involve a large number of parameters (the elements of many dictionaries), which easily leads to prohibitive computational cost and even overfitting. In this paper, we present a novel deep Auto-Encoder based Structured Dictionary (AESD) learning model that learns only one dictionary, composed of class-specific sub-dictionaries. Supervision is introduced by imposing discriminative category constraints, which make the dictionary discriminative. The encoding layers share parameters that are determined exactly by the dictionary carried by the decoding layer. As a result, learning reduces to a forward-propagation based optimization w.r.t. the dictionary only, which makes network training lightweight. For single-image classification, the trained encoding network is used directly, combined with a minimum-reconstruction-residual scheme. To broaden the applicability of the method, in the testing phase we further extend the prototype into a Convolutional Encoder based Block Sparse Representation (CEBSR) model, which promotes the latent block sparsity in the joint representation of an image set and achieves improved image-set classification. Extensive experiments verify the performance of the learned dictionary for image classification and the superiority of the extended model over state-of-the-art image set classification methods.
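To make the single-image pipeline more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the encoder, so the sketch assumes an ISTA-style unrolled encoder: every layer reuses the same weights `W` and `S`, both computed from the one dictionary `D`, and the decoder is simply `D` itself. Classification then picks the class whose sub-dictionary gives the smallest reconstruction residual. The function names, the sparsity weight `lam`, the layer count and the toy dictionary are all hypothetical; the discriminative training of `D` and the CEBSR extension are not shown.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def encode(D, y, lam=0.05, n_layers=50):
    """Assumed encoder: n_layers unrolled ISTA steps with shared weights.

    W and S depend only on the dictionary D, so D is the single
    learnable object in this sketch.
    """
    L = np.linalg.norm(D, 2) ** 2              # squared spectral norm of D
    W = D.T / L                                # shared input weights
    S = np.eye(D.shape[1]) - (D.T @ D) / L     # shared recurrent weights
    a = np.zeros(D.shape[1])
    for _ in range(n_layers):
        a = soft_threshold(S @ a + W @ y, lam / L)
    return a


def classify(D, atom_labels, y, lam=0.05, n_layers=50):
    """Minimum-reconstruction-residual rule over class-specific sub-dictionaries."""
    a = encode(D, y, lam, n_layers)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(y - D[:, atom_labels == c] @ a[atom_labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]


# Toy structured dictionary (hypothetical data): two classes, three atoms each,
# with each class's atoms supported on a disjoint block of coordinates.
rng = np.random.default_rng(0)
D = np.zeros((8, 6))
D[:4, :3] = rng.standard_normal((4, 3))
D[4:, 3:] = rng.standard_normal((4, 3))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
atom_labels = np.array([0, 0, 0, 1, 1, 1])

y0 = D[:, :3] @ np.array([1.0, 0.5, 0.0])      # sample built from class-0 atoms
y1 = D[:, 3:] @ np.array([0.0, 0.7, 1.2])      # sample built from class-1 atoms
print(classify(D, atom_labels, y0), classify(D, atom_labels, y1))
```

In this toy setting the two sub-dictionaries occupy orthogonal subspaces, so the code for each sample is nonzero only on the atoms of its own class and the residual rule separates the classes cleanly. Real data would not be this well separated, which is where discriminative constraints on the dictionary would matter.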
