Social web video clustering based on multi-view clustering via nonnegative matrix factorization

Social web videos are rich data sources, and the information they carry has great potential to improve social web video clustering. Social web video data typically come in multiple views, and multi-view clustering provides a useful way to generate clusters from such data. Previous studies have applied various kinds of single-view data to social web video clustering and classification; however, these methods do not take multi-view data into account. In this paper, we therefore propose a framework based on a novel online multi-view clustering algorithm (called SOMVCS) that groups social web videos with large-scale, possibly incomplete views into meaningful clusters. SOMVCS learns a latent feature matrix for each view and then drives these matrices towards a common consensus matrix using nonnegative matrix factorization (NMF). In particular, we incorporate graph regularization into the model to preserve local structure information. The experimental results show that online multi-view clustering via NMF is a preferable method for social web video clustering. Moreover, we find that clustering with multi-view data whose feature types come from different feature families outperforms clustering with data whose feature types come from a single family.
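
The consensus idea described above can be illustrated with a minimal sketch. It is not the SOMVCS algorithm: it is a batch (not online) joint NMF with a graph-regularization term, it assumes every view is complete, and it uses standard multiplicative updates. The function names (`knn_graph`, `objective`, `multiview_nmf`), the weights `alpha` and `beta`, and the k-nearest-neighbour graph construction are all assumptions made for illustration. Each view `X` is taken to be a nonnegative feature-by-sample matrix.

```python
import numpy as np

def knn_graph(X, k=3):
    """Symmetric 0/1 k-NN affinity over the samples (columns) of X (assumed construction)."""
    Z = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    S = Z.T @ Z                                   # cosine similarity between samples
    np.fill_diagonal(S, -np.inf)                  # a sample is not its own neighbour
    idx = np.argsort(-S, axis=1)[:, :k]           # k most similar samples per row
    W = np.zeros_like(S)
    W[np.repeat(np.arange(S.shape[0]), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                     # symmetrise

def objective(Xs, Us, Vs, Vc, Ws, alpha, beta):
    """Reconstruction error + disagreement with consensus + graph smoothness."""
    total = 0.0
    for X, U, V, W in zip(Xs, Us, Vs, Ws):
        L = np.diag(W.sum(axis=1)) - W            # graph Laplacian of this view
        total += np.linalg.norm(X - U @ V.T) ** 2
        total += alpha * np.linalg.norm(V - Vc) ** 2
        total += beta * np.trace(V.T @ L @ V)
    return total

def multiview_nmf(Xs, k, alpha=0.1, beta=0.1, n_iter=100, seed=0, eps=1e-9):
    """Hypothetical batch sketch: per-view factors pulled towards a consensus Vc."""
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[1]
    Ws = [knn_graph(X) for X in Xs]
    Ds = [np.diag(W.sum(axis=1)) for W in Ws]     # degree matrices
    Us = [rng.random((X.shape[0], k)) for X in Xs]
    Vs = [rng.random((n, k)) for _ in Xs]
    Vc = np.mean(Vs, axis=0)
    history = [objective(Xs, Us, Vs, Vc, Ws, alpha, beta)]
    for _ in range(n_iter):
        for i, X in enumerate(Xs):
            U, V = Us[i], Vs[i]
            # Multiplicative updates keep both factors nonnegative.
            U *= (X @ V) / (U @ (V.T @ V) + eps)
            num = X.T @ U + alpha * Vc + beta * (Ws[i] @ V)
            den = V @ (U.T @ U) + alpha * V + beta * (Ds[i] @ V) + eps
            V *= num / den
        Vc = np.mean(Vs, axis=0)                  # consensus minimising the alpha term
        history.append(objective(Xs, Us, Vs, Vc, Ws, alpha, beta))
    labels = Vc.argmax(axis=1)                    # cluster = largest consensus coefficient
    return Vc, labels, history

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    views = [rng.random((8, 12)), rng.random((5, 12))]   # two toy views, 12 samples
    Vc, labels, history = multiview_nmf(views, k=3, n_iter=50)
    print(labels, history[0], history[-1])
```

In this sketch the consensus matrix is simply the mean of the per-view coefficient matrices, and cluster labels are read off its largest entry per row. An online variant would process the samples in chunks rather than all at once, and handling incomplete views would need per-sample weights; both are left out here.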
