Online Bayesian max-margin subspace learning for multi-view classification and regression

Multi-view data have become increasingly common in real-world applications where data are generated from different information channels or views, such as image + text, audio + video, and webpage + link data. Recent decades have witnessed a number of studies devoted to multi-view learning algorithms, especially predictive latent subspace learning approaches, which aim to obtain a subspace shared by multiple views and then learn models in that shared subspace. However, few efforts have been made to handle online multi-view learning scenarios. In this paper, we propose an online Bayesian multi-view learning algorithm that learns a predictive subspace with the max-margin principle. Specifically, we first define the latent margin loss for classification or regression in the subspace, and then cast the learning problem into a variational Bayesian framework by exploiting the pseudo-likelihood and the data augmentation idea. With the variational approximate posterior inferred from past samples, we can naturally combine historical knowledge with newly arriving data, in a Bayesian passive-aggressive style. Finally, we extensively evaluate our model on several real-world data sets, and the experimental results show that it achieves superior performance compared with a number of state-of-the-art competitors.
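As background for the "passive-aggressive style" mentioned above, the following is a minimal sketch of the classic, non-Bayesian online passive-aggressive update (the PA-I variant) for binary classification with hinge loss. It is not the algorithm proposed in this paper, which maintains a variational posterior over a latent subspace rather than a point estimate. The function name `pa_update` and the default value of the aggressiveness parameter `C` are illustrative assumptions.

```python
import numpy as np


def pa_update(w, x, y, C=1.0):
    """One classic PA-I step for a label y in {-1, +1}.

    Illustrative only: a point-estimate update, not the Bayesian
    subspace model described in the abstract.
    """
    # Hinge loss of the current weights on the new sample.
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))
    sq_norm = float(np.dot(x, x))
    if loss == 0.0 or sq_norm == 0.0:
        # Passive: the margin is already satisfied (or x carries no signal).
        return w
    # Aggressive: smallest step that fixes the margin, capped by C.
    tau = min(C, loss / sq_norm)
    return w + tau * y * x


# Usage: process a small stream one sample at a time.
w = np.zeros(2)
stream = [(np.array([1.0, 0.0]), 1), (np.array([0.0, 2.0]), -1)]
for x_t, y_t in stream:
    w = pa_update(w, x_t, y_t)
```

The update leaves the weights unchanged when the new sample is already classified with sufficient margin and otherwise moves them just far enough to correct it. The Bayesian variant described in the abstract applies the same trade-off to a posterior distribution rather than to a single weight vector.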
