Multiview Machine Learning

Semi-supervised learning is concerned with learning scenarios where only a small portion of the training data is labeled. In multiview settings, unlabeled data can be used to regularize the prediction functions and thus reduce the search space. In this chapter, we introduce two categories of multiview semi-supervised learning methods. The first comprises the co-training style methods, in which the prediction function for each view is trained with its own objective and each prediction function is improved by the others. The second comprises the co-regularization style methods, in which the prediction functions from different views are trained simultaneously under a single objective function.
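As a rough illustration of the co-training style, the following Python sketch lets two views take turns pseudo-labeling unlabeled examples for a shared labeled pool. It is only a minimal sketch under stated assumptions, not the chapter's method: the nearest-centroid classifier, the binary labels {0, 1}, the distance-margin confidence score, and the names `fit_centroids`, `predict`, `co_train`, `rounds`, and `per_round` are all hypothetical choices made for the example.

```python
import numpy as np

def fit_centroids(X, y):
    # Assumed base learner: nearest-centroid classifier for binary labels {0, 1}.
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    # Distance of every point to both class centroids.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    # Predicted label, plus the distance margin used here as a confidence score.
    return d.argmin(axis=1), np.abs(d[:, 0] - d[:, 1])

def co_train(X1, X2, labeled_idx, labeled_y, rounds=10, per_round=5):
    # X1, X2: feature matrices of the two views (same examples, same row order).
    n = X1.shape[0]
    y = np.full(n, -1)                    # -1 marks "no label yet"
    y[labeled_idx] = labeled_y
    for _ in range(rounds):
        for X_view in (X1, X2):           # each view takes a turn as the teacher
            known = np.flatnonzero(y >= 0)
            unknown = np.flatnonzero(y < 0)
            if unknown.size == 0:
                break
            # Train on this view only, using the shared pool of (pseudo-)labels.
            centroids = fit_centroids(X_view[known], y[known])
            pred, conf = predict(centroids, X_view[unknown])
            # The most confident predictions become pseudo-labels that the
            # other view's classifier will train on in its next turn.
            top = np.argsort(-conf)[:per_round]
            y[unknown[top]] = pred[top]
    known = np.flatnonzero(y >= 0)
    # One final classifier per view, fitted on the enlarged label pool.
    return fit_centroids(X1[known], y[known]), fit_centroids(X2[known], y[known])
```

Each view keeps its own training objective, and the views influence one another only through the pseudo-labels they contribute. A co-regularization style method would instead fit both prediction functions jointly, typically with a term that penalizes their disagreement on the unlabeled data.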

[1]  Yale Song,et al.  Multi-view latent variable discriminative models for action recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Jinbo Bi,et al.  Multi-view Sparse Co-clustering via Proximal Alternating Linearized Minimization , 2015, ICML.

[3]  Philip S. Yu,et al.  A General Model for Multiple View Unsupervised Learning , 2008, SDM.

[4]  Subhashini Venugopalan,et al.  Translating Videos to Natural Language Using Deep Recurrent Neural Networks , 2014, NAACL.

[5]  Yoshua Bengio,et al.  Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.

[6]  Xiaojie Wang,et al.  Correspondence Autoencoders for Cross-Modal Retrieval , 2015, ACM Trans. Multim. Comput. Commun. Appl..

[7]  Ruslan Salakhutdinov,et al.  Adaptive Overrelaxed Bound Optimization Methods , 2003, ICML.

[8]  Geoffrey E. Hinton Deep belief networks , 2009, Scholarpedia.

[9]  Hui Xiong,et al.  Multi-task Multi-view Learning for Heterogeneous Tasks , 2014, CIKM.

[10]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[11]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[12]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[13]  Nello Cristianini,et al.  Composite Kernels for Hypertext Categorisation , 2001, ICML.

[14]  Feiping Nie,et al.  Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Multi-View K-Means Clustering on Big Data , 2022 .

[15]  Shiliang Sun,et al.  Multi-view Transfer Learning with Adaboost , 2011, 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence.

[16]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[17]  Zhi-Hua Zhou,et al.  Exploiting Unlabeled Data in Content-Based Image Retrieval , 2004, ECML.

[18]  Alex Graves,et al.  DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.

[19]  Geoffrey Zweig,et al.  From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Louis-Philippe Morency,et al.  Deep multimodal fusion for persuasiveness prediction , 2016, ICMI.

[21]  Jeff A. Bilmes,et al.  On Deep Multi-View Representation Learning , 2015, ICML.

[22]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[23]  Ruslan Salakhutdinov,et al.  Generating Images from Captions with Attention , 2015, ICLR.

[24]  Rabab Kreidieh Ward,et al.  Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[25]  Xianchao Zhang,et al.  Multi-Task Multi-View Clustering for Non-Negative Data , 2015, IJCAI.

[26]  Shiliang Sun,et al.  Sparse Multimodal Gaussian Processes , 2017, IScIDE.

[27]  Honglak Lee,et al.  Deep learning for robust feature generation in audiovisual emotion recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[28]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[29]  Krystian Mikolajczyk,et al.  Deep correlation for matching images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Mohamed R. Amer,et al.  Facial Attributes Classification Using Multi-task Representation Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[31]  Neil D. Lawrence,et al.  Fast Sparse Gaussian Process Methods: The Informative Vector Machine , 2002, NIPS.

[32]  Maria-Florina Balcan,et al.  Co-Training and Expansion: Towards Bridging Theory and Practice , 2004, NIPS.

[33]  H. Hotelling Relations Between Two Sets of Variates , 1936 .

[34]  Ruslan Salakhutdinov,et al.  Multimodal Neural Language Models , 2014, ICML.

[35]  Feiping Nie,et al.  Heterogeneous Visual Features Fusion via Sparse Multimodal Machine , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Wei Xu,et al.  Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.

[37]  Jing Huang,et al.  Audio-visual deep learning for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[38]  Ruslan Salakhutdinov,et al.  Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.

[39]  John Shawe-Taylor,et al.  Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.

[40]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[41]  Tom Diethe,et al.  Multiview Fisher Discriminant Analysis , 2008 .

[42]  Shiliang Sun,et al.  Multi-view Deep Gaussian Processes , 2018, ICONIP.

[43]  Shiliang Sun,et al.  Multi-source Transfer Learning with Multi-view Adaboost , 2012, ICONIP.

[44]  Shiliang Sun,et al.  Multi-View Maximum Entropy Discrimination , 2013, IJCAI.

[45]  Fei-Fei Li,et al.  Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.

[46]  Trevor Darrell,et al.  Sequence to Sequence -- Video to Text , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[47]  Shiliang Sun,et al.  Nonlinear Combination of Multiple Kernels for Support Vector Machines , 2010, 2010 20th International Conference on Pattern Recognition.

[48]  Hugo Larochelle,et al.  Efficient Learning of Deep Boltzmann Machines , 2010, AISTATS.

[49]  Jason Weston,et al.  WSABIE: Scaling Up to Large Vocabulary Image Annotation , 2011, IJCAI.

[50]  Marc Teboulle,et al.  Proximal alternating linearized minimization for nonconvex and nonsmooth problems , 2013, Mathematical Programming.

[51]  Feiping Nie,et al.  Multi-View Clustering and Feature Learning via Structured Sparsity , 2013, ICML.

[52]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[53]  Shiliang Sun,et al.  Manifold-preserving graph reduction for sparse semi-supervised learning , 2014, Neurocomputing.

[54]  Kevin Gimpel,et al.  Deep Multilingual Correlation for Improved Word Embeddings , 2015, HLT-NAACL.

[55]  R. Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications. , 2013, IEEE transactions on pattern analysis and machine intelligence.

[56]  Hal Daumé,et al.  Co-regularized Multi-view Spectral Clustering , 2011, NIPS.

[57]  Jintao Zhang,et al.  Inductive multi-task learning with multiple view data , 2012, KDD.

[58]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[59]  Carina Silberer,et al.  Learning Grounded Meaning Representations with Autoencoders , 2014, ACL.

[60]  Honglak Lee,et al.  Improved Multimodal Deep Learning with Variation of Information , 2014, NIPS.

[61]  Yoshua Bengio,et al.  Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.

[62]  Trevor Darrell,et al.  Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Wenwu Zhu,et al.  Deep Multimodal Hashing with Orthogonal Regularization , 2015, IJCAI.

[64]  Yoshua Bengio,et al.  Generative Adversarial Networks , 2014, ArXiv.

[65]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[66]  Anil K. Jain,et al.  Clustering ensembles: models of consensus and weak partitions , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67]  Seong-Whan Lee,et al.  Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis , 2014, NeuroImage.

[68]  Shiliang Sun,et al.  Multiple-view multiple-learner active learning , 2010, Pattern Recognit..

[69]  Jiawei Han,et al.  Multi-View Clustering via Joint Nonnegative Matrix Factorization , 2013, SDM.

[70]  Armand Joulin,et al.  Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.

[71]  Wen Zhang,et al.  TSFS: A Novel Algorithm for Single View Co-training , 2009, 2009 International Joint Conference on Computational Sciences and Optimization.

[72]  Ion Muslea,et al.  Active Learning with Multiple Views , 2009, Encyclopedia of Data Warehousing and Mining.

[73]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[74]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[75]  Patrick Gallinari,et al.  Ranking with ordered weighted pairwise classification , 2009, ICML '09.

[76]  Ming Liu,et al.  Multimodal DBN for Predicting High-Quality Answers in cQA portals , 2013, ACL.

[77]  Shiliang Sun,et al.  Multi-view clustering ensembles , 2013, 2013 International Conference on Machine Learning and Cybernetics.

[78]  Dan Zhang,et al.  Multi-view transfer learning with a large margin approach , 2011, KDD.

[79]  Shiliang Sun,et al.  PAC-Bayes analysis of multi-view learning , 2014, Inf. Fusion.

[80]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[81]  Quoc V. Le,et al.  Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.

[82]  Dacheng Tao,et al.  Large-Margin Multi-ViewInformation Bottleneck , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[83]  Ruifan Li,et al.  Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.

[84]  Jean-Luc Dugelay,et al.  Face aging with conditional generative adversarial networks , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[85]  Tommi S. Jaakkola,et al.  Maximum Entropy Discrimination , 1999, NIPS.

[86]  Honglak Lee,et al.  Attribute2Image: Conditional Image Generation from Visual Attributes , 2015, ECCV.

[87]  Shih-Fu Chang,et al.  Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[88]  Ivor W. Tsang,et al.  Spectral Embedded Clustering: A Framework for In-Sample and Out-of-Sample Spectral Clustering , 2011, IEEE Transactions on Neural Networks.

[89]  Margaret Mitchell,et al.  VQA: Visual Question Answering , 2015, International Journal of Computer Vision.

[90]  Xiaogang Wang,et al.  Multi-source Deep Learning for Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[91]  Koray Kavukcuoglu,et al.  Multiple Object Recognition with Visual Attention , 2014, ICLR.

[92]  Liang Ge,et al.  Multi-source deep learning for information trustworthiness estimation , 2013, KDD.

[93]  Dimitris Achlioptas,et al.  On Spectral Learning of Mixtures of Distributions , 2005, COLT.

[94]  Xuelong Li,et al.  Multi-view Subspace Clustering , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[95]  Lukás Burget,et al.  Recurrent neural network based language model , 2010, INTERSPEECH.

[96]  Wei Chen,et al.  Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework , 2015, AAAI.

[97]  Ulf Leser,et al.  Systematic feature evaluation for gene name recognition , 2005, BMC Bioinformatics.

[98]  Yansong Feng,et al.  Visual Information in Semantic Representation , 2010, NAACL.

[99]  Chong-Wah Ngo,et al.  Mutlimodal Learning with Deep Boltzmann Machine for Emotion Prediction in User Generated Videos , 2015, ICMR.

[100]  Shiliang Sun,et al.  Soft Margin Consistency Based Scalable Multi-View Maximum Entropy Discrimination , 2016, IJCAI.

[101]  Nitish Srivastava,et al.  Learning Representations for Multimodal Data with Deep Belief Nets , 2012 .

[102]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[103]  Shiliang Sun,et al.  Active learning with extremely sparse labeled examples , 2010, Neurocomputing.

[104]  Craig A. Knoblock,et al.  Active + Semi-supervised Learning = Robust Multi-View Learning , 2002, ICML.

[105]  Jingjing Tang,et al.  Multiview Privileged Support Vector Machines , 2018, IEEE Transactions on Neural Networks and Learning Systems.

[106]  Gavriel Salomon,et al.  T RANSFER OF LEARNING , 1992 .

[107]  Xirong Li,et al.  Word2VisualVec: Cross-Media Retrieval by Visual Feature Prediction , 2016, ArXiv.

[108]  Nitish Srivastava,et al.  Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..

[109]  Jason Weston,et al.  Large scale image annotation: learning to rank with joint word-image embeddings , 2010, Machine Learning.

[110]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[111]  Wei Gao,et al.  Multi-View Discriminant Transfer Learning , 2013, IJCAI.

[112]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[113]  Christopher Meek,et al.  Semantic Parsing for Single-Relation Question Answering , 2014, ACL.

[114]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[115]  Bernt Schiele,et al.  Learning Deep Representations of Fine-Grained Visual Descriptions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[116]  Shiliang Sun,et al.  View Construction for Multi-view Semi-supervised Learning , 2011, ISNN.