One-Pass Multi-View Learning

Multi-view learning is an important learning paradigm in which data come from multiple channels or appear in multiple modalities. Many approaches have been developed in this field and have achieved better performance than single-view methods. Those approaches, however, typically work on small, low-dimensional datasets owing to their high computational cost. In recent years, many applications have come to involve large-scale multi-view data; e.g., hundreds of hours of video (including visual, audio, and text views) are uploaded to YouTube every minute, posing a major challenge to previous multi-view algorithms. This work concentrates on large-scale multi-view learning for classification and proposes the One-Pass Multi-View (OPMV) framework, which goes through the training data only once without storing the entire set of training examples. The approach jointly optimizes composite objective functions subject to consistency linear constraints across the different views. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.
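To make the one-pass idea concrete, the sketch below shows a hypothetical simplification of the setting the abstract describes: each view keeps a linear classifier, a squared penalty softly enforces agreement between the two views' predictions (a relaxation of the consistency linear constraints mentioned above, which OPMV itself handles with an ADMM-style scheme), and every example is processed exactly once and never stored. The function name `opmv_sketch` and all hyperparameters are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def opmv_sketch(stream, d1, d2, lr=0.1, rho=1.0, lam=0.01):
    """One pass over a stream of (x1, x2, y) triples with y in {-1, +1}.

    Illustrative sketch only: per-view hinge loss, L2 regularization,
    and a soft consistency penalty rho/2 * (x1.w1 - x2.w2)^2 in place
    of OPMV's hard linear constraints.
    """
    w1 = np.zeros(d1)
    w2 = np.zeros(d2)
    for t, (x1, x2, y) in enumerate(stream, 1):
        eta = lr / np.sqrt(t)                 # decaying step size
        p1, p2 = x1 @ w1, x2 @ w2
        # hinge-loss subgradients for each view
        g1 = -y * x1 if y * p1 < 1 else np.zeros(d1)
        g2 = -y * x2 if y * p2 < 1 else np.zeros(d2)
        # gradient of the consistency penalty rho/2 * (p1 - p2)^2
        diff = p1 - p2
        g1 = g1 + rho * diff * x1
        g2 = g2 - rho * diff * x2
        # regularized stochastic updates; the example is then discarded
        w1 -= eta * (g1 + lam * w1)
        w2 -= eta * (g2 + lam * w2)
    return w1, w2
```

At test time one can combine the views by summing the two linear scores and taking the sign; the memory footprint stays at two weight vectors regardless of how long the stream is, which is the point of a one-pass method.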
