Multi-View Concept Learning for Data Representation

Real-world datasets often involve multiple views of data items: a Web page can be described by both its content and the anchor texts of hyperlinks leading to it, and photos on Flickr can be characterized by visual features as well as user-contributed tags. Different views provide complementary information. Synthesizing multi-view features can yield a more comprehensive description of the data items, which can benefit many data analytics applications. Unfortunately, simply concatenating the feature vectors ignores the statistical properties of each view and usually incurs the “curse of dimensionality”. We propose Multi-view Concept Learning (MCL), a novel nonnegative latent representation learning algorithm for capturing conceptual factors from multi-view data. MCL exploits both multi-view information and label information. The key idea is to learn a common latent space across views that (1) captures the semantic relationships between data items through graph embedding regularization on labeled items, and (2) allows each latent factor to be associated with a subset of views via sparseness constraints. In this way, MCL can capture flexible conceptual patterns hidden in multi-view features. Experiments on a toy problem and three real-world datasets show that MCL outperforms baseline methods.
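
The idea of a common nonnegative latent space shared across views can be illustrated with a minimal sketch. This is not the MCL algorithm: it is a plain joint nonnegative matrix factorization with standard multiplicative updates, and it omits both the graph embedding regularization on labeled items and the per-view sparseness constraints described above. Without those terms it is equivalent to ordinary NMF on the stacked feature matrix, so it only shows the shared-factor structure that MCL builds on. The function name `joint_nmf`, its parameters, and the features-by-items layout of each view are assumptions made for illustration.

```python
import numpy as np

def joint_nmf(views, n_factors, n_iter=200, eps=1e-9, seed=0):
    """Factor each view X_v (features x items) as W_v @ H with a shared H.

    Hypothetical helper: minimizes sum_v ||X_v - W_v H||_F^2 subject to
    W_v >= 0 and H >= 0, using Lee-Seung style multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_items = views[0].shape[1]
    # One nonnegative basis per view, one coefficient matrix shared by all views.
    W = [rng.random((X.shape[0], n_factors)) for X in views]
    H = rng.random((n_factors, n_items))
    losses = []
    for _ in range(n_iter):
        # Update each view-specific basis with H held fixed.
        for v, X in enumerate(views):
            W[v] *= (X @ H.T) / (W[v] @ (H @ H.T) + eps)
        # Update the shared representation using evidence from every view.
        num = sum(Wv.T @ X for Wv, X in zip(W, views))
        den = sum(Wv.T @ Wv for Wv in W) @ H + eps
        H *= num / den
        losses.append(sum(np.linalg.norm(X - Wv @ H) ** 2
                          for Wv, X in zip(W, views)))
    return W, H, losses

# Toy data: two synthetic nonnegative views of the same 30 items.
rng = np.random.default_rng(1)
views = [rng.random((20, 30)), rng.random((15, 30))]
W, H, losses = joint_nmf(views, n_factors=4)
print(H.shape, losses[0] > losses[-1])
```

Each column of `H` is the latent representation of one item, and each `W[v]` maps the shared factors back into the feature space of view `v`. Adding a label-driven graph regularizer on `H` and a sparsity penalty that lets a factor switch off in some views would move this sketch toward the approach the abstract describes.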
