Deep Embedded Complementary and Interactive Information for Multi-View Classification

Multi-view classification optimally integrates various features from different views to improve classification tasks. Though most of the existing works demonstrate promising performance in various computer vision applications, we observe that they can be further improved by sufficiently utilizing complementary view-specific information, deep interactive information between different views, and the strategy of fusing various views. In this work, we propose a novel multi-view learning framework that seamlessly embeds various view-specific information and deep interactive information and introduces a novel multi-view fusion strategy to make a joint decision during the optimization for classification. Specifically, we utilize different deep neural networks to learn multiple view-specific representations, and model deep interactive information through a shared interactive network using the cross-correlations between attributes of these representations. After that, we adaptively integrate multiple neural networks by flexibly tuning the power exponent of weight, which not only avoids the trivial solution of weight but also provides a new approach to fuse outputs from different deterministic neural networks. Extensive experiments on several public datasets demonstrate the rationality and effectiveness of our method.

[1]  Shotaro Akaho,et al.  A kernel method for canonical correlation analysis , 2006, ArXiv.

[2]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Hidetoshi Shimodaira,et al.  A probabilistic framework for multi-view feature learning with many-to-many associations via neural networks , 2018, ICML.

[4]  Silvio Savarese,et al.  Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[5]  Jeff A. Bilmes,et al.  Deep Canonical Correlation Analysis , 2013, ICML.

[6]  Krzysztof Walas,et al.  A hierarchical approach for joint multi-view object pose estimation and categorization , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[7]  Subhransu Maji,et al.  Bilinear CNN Models for Fine-Grained Visual Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  J. Shawe-Taylor,et al.  Multi-View Canonical Correlation Analysis , 2010 .

[9]  Nitish Srivastava,et al.  Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..

[10]  Li-Jia Li,et al.  Multi-view Face Detection Using Deep Convolutional Neural Networks , 2015, ICMR.

[11]  Juho Rousu,et al.  Large-Scale Sparse Kernel Canonical Correlation Analysis , 2019, ICML.

[12]  Feiping Nie,et al.  Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Multi-View K-Means Clustering on Big Data , 2022 .

[13]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Janaina Mourão Miranda,et al.  Unsupervised analysis of fMRI data using kernel canonical correlation , 2007, NeuroImage.

[15]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[17]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[18]  Christoph H. Lampert,et al.  Correlational spectral clustering , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Feiping Nie,et al.  Large-Scale Multi-View Spectral Clustering via Bipartite Graph , 2015, AAAI.

[20]  Shiguang Shan,et al.  Multi-view Deep Network for Cross-View Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  David W. Jacobs,et al.  Generalized Multiview Analysis: A discriminative latent space , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[24]  Yang Gao,et al.  Compact Bilinear Pooling , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Jian Sun,et al.  Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  H. Hotelling Relations Between Two Sets of Variates , 1936 .

[27]  Silvio Savarese,et al.  Multi-view Object Categorization and Pose Estimation , 2010, Computer Vision: Detection, Recognition and Reconstruction.

[28]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[29]  Shiliang Sun,et al.  Multi-view uncorrelated linear discriminant analysis with applications to handwritten digit recognition , 2014, 2014 International Joint Conference on Neural Networks (IJCNN).

[30]  Gerhard Widmer,et al.  Deep Linear Discriminant Analysis , 2015, ICLR.

[31]  Yasuyuki Matsushita,et al.  RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Andrew Zisserman,et al.  Fisher Vector Faces in the Wild , 2013, BMVC.

[33]  Massih-Reza Amini,et al.  Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization , 2009, NIPS.

[34]  Juhan Nam,et al.  Multimodal Deep Learning , 2011, ICML.

[35]  Silvio Savarese,et al.  A multi-view probabilistic model for 3D object classes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Jefersson Alex dos Santos,et al.  Towards better exploiting convolutional neural networks for remote sensing scene classification , 2016, Pattern Recognit..

[37]  Jeff A. Bilmes,et al.  On Deep Multi-View Representation Learning , 2015, ICML.