Paper Information - ODMTCNet: An Interpretable Multi-view Deep Neural Network Architecture for Image Feature Representation

Recently, deep cascade architecture-based algorithms have attracted wide interest and have been applied successfully in various application domains. However, the longstanding challenge of interpretability is still considered the Achilles' heel of such algorithms. Moreover, because of its data-driven nature, the deep cascade architecture is prone to overfitting when sufficient data are not available. To address these pressing issues, this work proposes an interpretable multi-view deep neural network architecture, namely the optimal discriminant multi-view tensor convolutional network (ODMTCNet), by integrating statistical machine learning (SML) principles with the deep neural network (DNN) architecture. Benefiting from the joint strength of SML and DNN, we demonstrate that ODMTCNet is analytically interpretable for multi-view image feature representation. Specifically, a discriminant multi-view tensor convolution strategy is proposed and integrated with the desired deep cascade architecture to generate high-quality feature representations. Unlike in traditional DNN models, the parameters of the convolutional layers in ODMTCNet are determined by analytically solving an SML-based optimization problem in each convolutional layer independently. This work demonstrates that, in ODMTCNet, the relation between the optimal performance and the parameters (e.g., the number of convolutional filters) can be predicted, with each layer generating justified knowledge representations, leading to an interpretable multi-view convolutional network. In addition, an information-theoretic descriptor, information quality (IQ), is utilized for feature representation of the given multi-view data sets. Because of its unique design, ODMTCNet is able to handle image data sets of different scales, large or small, effectively addressing the data-hungry nature of DNNs in image representation and forming a generic platform for multi-view image feature representation.
To validate the effectiveness and the generic nature of the proposed ODMTCNet, we conducted experiments on four image data sets of different scales: the Olivetti Research Lab (ORL) database, the Facial Recognition Technology (FERET) database, the ETH-80 database, and the Caltech-256 database. The results show the superiority of the proposed solution over state-of-the-art methods.
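
The abstract states that the convolutional filters are obtained by analytically solving an SML-based optimization problem in each layer, but it does not give the objective itself. Purely as an illustration of what "analytically solved filters" can look like, the sketch below learns filters for a single layer and a single view from a classical Fisher discriminant criterion on image patches, solved as a generalized eigenproblem. This is a stand-in technique, not the ODMTCNet formulation: the function names `extract_patches` and `discriminant_filters`, the ridge term `reg`, and the choice of criterion are all assumptions made for this example.

```python
import numpy as np


def extract_patches(images, k):
    """Collect every k x k patch from a stack of images (n, h, w),
    flatten it, and remove its mean."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())
    return np.asarray(patches)


def discriminant_filters(images, labels, k, num_filters, reg=1e-3):
    """Hypothetical stand-in: learn k x k filters by solving the Fisher
    problem Sb v = lambda Sw v on patches, with no gradient descent."""
    images = np.asarray(images, dtype=float)
    per_image = (images.shape[1] - k + 1) * (images.shape[2] - k + 1)
    X = extract_patches(images, k)
    y = np.repeat(np.asarray(labels), per_image)  # patches inherit image labels

    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = mc - mean_all
        Sb += len(Xc) * np.outer(diff, diff)
    Sw += reg * np.eye(d)  # ridge term (assumption) keeps Sw invertible

    # Whiten with Sw, then solve an ordinary symmetric eigenproblem.
    evals, evecs = np.linalg.eigh(Sw)
    W = evecs / np.sqrt(evals)            # W.T @ Sw @ W = I
    M = W.T @ Sb @ W
    M = (M + M.T) / 2.0                   # remove round-off asymmetry
    vals, U = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:num_filters]

    V = W @ U[:, order]                   # generalized eigenvectors
    V /= np.linalg.norm(V, axis=0)        # unit-norm filters
    return V.T.reshape(num_filters, k, k), vals[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ramp = np.tile(np.arange(8.0), (8, 1))
    toy = np.stack([ramp + 0.1 * rng.standard_normal((8, 8)) for _ in range(5)]
                   + [ramp.T + 0.1 * rng.standard_normal((8, 8)) for _ in range(5)])
    filters, scores = discriminant_filters(toy, [0] * 5 + [1] * 5, k=3, num_filters=2)
    print(filters.shape, scores)
```

In this stand-in, the between-class scatter has rank at most the number of classes minus one, so only that many filters carry a nonzero discriminant score. That is one simple way in which a useful number of filters can be predicted from the problem rather than tuned; whether ODMTCNet's own prediction takes this form is not stated in the abstract.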
