Embedded Deep Bilinear Interactive Information and Selective Fusion for Multi-view Learning

As a concrete application of multi-view learning, multi-view classification significantly improves on traditional classification methods by optimally integrating multiple views. Although most previous efforts have demonstrated the superiority of multi-view learning, it can be further improved by embedding more powerful cross-view interactive information and a more reliable multi-view fusion strategy. To this end, we propose a novel multi-view learning framework that improves multi-view classification in these two aspects. Specifically, we seamlessly embed various intra-view information, cross-view multi-dimension bilinear interactive information, and a new view ensemble mechanism into a unified framework that makes a decision via optimization. In particular, we train different deep neural networks to learn various intra-view representations, and then dynamically learn multi-dimension bilinear interactive information from different bilinear similarities via a bilinear function between views. After that, we adaptively fuse the representations of multiple views by flexibly tuning the view-weight parameters, which not only avoids the trivial solution for the weights but also provides a new way to select a few discriminative views that are beneficial to the multi-view classification decision. Extensive experiments on six publicly available datasets demonstrate the effectiveness of the proposed method.
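The two ingredients named above, a multi-dimension bilinear interaction between views and a weighted view fusion, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the bilinear form `s_k = x^T W_k y` with `K` learned matrices, the function names `bilinear_interaction` and `fuse_views`, and the exponent `gamma` applied to normalized view weights (a common device for keeping the weights from collapsing onto a single view). It is not the authors' implementation.

```python
import numpy as np

def bilinear_interaction(x, y, W):
    """Multi-dimension bilinear similarity between two views (illustrative).

    x: (n, dx) representations from view 1
    y: (n, dy) representations from view 2
    W: (K, dx, dy) stack of K bilinear matrices (assumed form)
    Returns an (n, K) array with s[i, k] = x[i]^T W[k] y[i].
    """
    return np.einsum("nd,kde,ne->nk", x, W, y)

def fuse_views(reps, weights, gamma=2.0):
    """Weighted sum of same-shaped view representations (illustrative).

    The weights are normalized to sum to one and then raised to the
    power gamma; this weighting scheme is an assumption, not taken
    from the paper.
    """
    w = np.asarray(weights, dtype=float)
    w = (w / w.sum()) ** gamma
    return sum(wi * r for wi, r in zip(w, reps))

# Toy data: 4 samples, two views of dimension 5 and 3, K = 2 interactions.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))
y = rng.normal(size=(4, 3))
W = rng.normal(size=(2, 5, 3))

s = bilinear_interaction(x, y, W)          # shape (4, 2)
fused = fuse_views([s, s], [0.5, 0.5])     # shape (4, 2)
```

In a trained model, `W` and the view weights would be learned jointly with the per-view networks; here they are random or fixed only to keep the sketch self-contained.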
