Collaborative Co-clustering across Multiple Social Media

Combining information from multiple sources can often improve performance on a learning task, because information from one source can compensate for information missing from another. However, designing an effective combination scheme is not always straightforward in practice. This paper aims to combine information from multiple social media websites to enhance the co-clustering of two types of objects (social media objects and users) in one social network, exploiting the fact that users leave footprints across different social media websites such as Twitter and Foursquare. Data generated from multiple heterogeneous sources can be cast in a multi-view setting. Specifically, we treat the relationship matrix as the relationship view and the features of each individual object from different sources as separate feature views. Previous work added features beyond the relationship matrix to co-clustering in one of two ways: either the features were taken indiscriminately, or they acted as hard constraints that forced the final co-clusters to agree with them. We propose a co-regularized collaborative co-clustering model (Co-CoClust) that simultaneously performs co-clustering on the relationship view and clustering on the multiple feature views. In this framework, features from different sources are treated differentially: they are divided into separate views and utilized based on their distance from the relationship matrix. A co-regularization technique imposes a common constraint between the co-clustering and the clusterings so that the relationship matrix and feature matrix are unified. Through alternating minimization, the co-clustering and clustering results from the different views are iteratively optimized, so the co-clustering results are improved by leveraging information from multiple sources. The proposed algorithm is shown to be effective on social media datasets and traditional document-word datasets.
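The alternating, co-regularized update described above can be illustrated with a small sketch. This is not the Co-CoClust objective itself, which the abstract does not spell out; it swaps in a spectral formulation in the style of co-regularized multi-view spectral clustering, with one relationship view and one feature view. The function names, the cosine-similarity kernel, and the parameters `lam` and `n_iter` are all assumptions made for illustration.

```python
import numpy as np


def top_eigvecs(M, k):
    """Return the k eigenvectors of a symmetric matrix with the largest eigenvalues."""
    M = (M + M.T) / 2.0               # guard against tiny asymmetries
    _, vecs = np.linalg.eigh(M)       # eigenvalues come back in ascending order
    return vecs[:, -k:]


def normalized_kernel(K):
    """Symmetric degree normalization D^{-1/2} K D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(K.sum(axis=1), 1e-12))
    return K * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def co_regularized_embeddings(R, F, k, lam=0.5, n_iter=10):
    """Hypothetical sketch: alternate between a relationship view and a feature view.

    R : (n_rows, n_cols) nonnegative relationship matrix (e.g. users x items)
    F : (n_rows, n_feats) feature matrix for the row objects, from another source
    """
    # Relationship view: bipartite normalization, as in spectral co-clustering.
    d_row = np.maximum(R.sum(axis=1), 1e-12)
    d_col = np.maximum(R.sum(axis=0), 1e-12)
    Rn = R / np.sqrt(d_row)[:, None] / np.sqrt(d_col)[None, :]
    K_rel = Rn @ Rn.T

    # Feature view: cosine similarity, clipped to be nonnegative (an assumption).
    Fn = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    K_feat = normalized_kernel(np.clip(Fn @ Fn.T, 0.0, None))

    U_rel = top_eigvecs(K_rel, k)
    U_feat = top_eigvecs(K_feat, k)
    for _ in range(n_iter):
        # Each view is re-solved with the other view's embedding held fixed;
        # the lam term rewards agreement between the two row embeddings.
        U_rel = top_eigvecs(K_rel + lam * (U_feat @ U_feat.T), k)
        U_feat = top_eigvecs(K_feat + lam * (U_rel @ U_rel.T), k)

    # Column objects are embedded through the normalized relationship matrix.
    V_rel = Rn.T @ U_rel
    return U_rel, V_rel, U_feat
```

Running k-means on the stacked rows of `U_rel` and `V_rel` would then give joint row and column clusters, in the spirit of bipartite spectral co-clustering. A larger `lam` pulls the relationship-view embedding closer to the feature-view embedding, while `lam = 0` reduces to clustering each view independently.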
