Multi-task co-clustering via nonnegative matrix factorization

Recent results have shown empirically that, given several related tasks with different data distributions and an algorithm that can exploit both task-specific and cross-task knowledge, the clustering performance of each task can be significantly enhanced. This kind of unsupervised learning is called multi-task clustering. We tackle the multi-task clustering problem via a 3-factor nonnegative matrix factorization. The objective of our approach consists of two parts: (1) Within-task co-clustering: co-cluster the data in the input space of each task individually. (2) Cross-task regularization: learn and refine the relations between the feature spaces of different tasks. We show that our approach has a sound information-theoretic foundation, and experimental evaluation shows that it outperforms many state-of-the-art single-task and multi-task clustering methods.
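
To make the within-task part concrete, here is a minimal sketch of a generic 3-factor NMF, X ≈ F S Gᵀ, fitted with standard multiplicative updates for squared Frobenius error. It is an illustration under stated assumptions, not the algorithm of this paper: the cross-task regularization term is left out, and the names tri_nmf, k_rows, k_cols and the toy matrix are hypothetical. Rows of F can be read as soft cluster memberships of the data points, rows of G as soft memberships of the features, and S as the association between the two sets of clusters. The code is plain Python so that it runs without dependencies.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mult_update(M, numer, denom, eps=1e-9):
    """Elementwise M * numer / denom; nonnegative inputs stay nonnegative."""
    return [[m * n / (d + eps) for m, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(M, numer, denom)]

def sq_error(X, F, S, G):
    """Squared Frobenius norm of X - F S G^T."""
    R = matmul(matmul(F, S), transpose(G))
    return sum((x - r) ** 2 for rx, rr in zip(X, R) for x, r in zip(rx, rr))

def tri_nmf(X, k_rows, k_cols, n_iter=200, seed=0):
    """Generic (unregularized) tri-factorization; an assumed stand-in only."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])

    def rand(r, c):
        # Strictly positive random initialization.
        return [[rng.random() + 0.1 for _ in range(c)] for _ in range(r)]

    F, S, G = rand(n, k_rows), rand(k_rows, k_cols), rand(m, k_cols)
    Xt = transpose(X)
    errors = [sq_error(X, F, S, G)]
    for _ in range(n_iter):
        # F <- F * (X G S^T) / (F S G^T G S^T)
        GSt = matmul(G, transpose(S))
        F = mult_update(F, matmul(X, GSt),
                        matmul(F, matmul(transpose(GSt), GSt)))
        # G <- G * (X^T F S) / (G S^T F^T F S)
        FS = matmul(F, S)
        G = mult_update(G, matmul(Xt, FS),
                        matmul(G, matmul(transpose(FS), FS)))
        # S <- S * (F^T X G) / (F^T F S G^T G)
        Ft, Gt = transpose(F), transpose(G)
        S = mult_update(S, matmul(matmul(Ft, X), G),
                        matmul(matmul(matmul(Ft, F), S), matmul(Gt, G)))
        errors.append(sq_error(X, F, S, G))
    return F, S, G, errors

# Toy document-by-word matrix with two visible blocks (made-up data).
X = [[5, 4, 0, 0],
     [4, 5, 0, 0],
     [5, 5, 0, 0],
     [0, 0, 3, 4],
     [0, 0, 4, 3],
     [0, 0, 4, 4]]

F, S, G, errors = tri_nmf(X, k_rows=2, k_cols=2)
row_clusters = [max(range(2), key=lambda k: row[k]) for row in F]
col_clusters = [max(range(2), key=lambda k: row[k]) for row in G]
print("row clusters:", row_clusters)
print("column clusters:", col_clusters)
print("error: %.3f -> %.3f" % (errors[0], errors[-1]))
```

A multi-task variant in the spirit of the abstract would fit one such factorization per task and add a penalty that couples the feature-side factors across tasks; the form of that penalty is not given here, so it is not sketched.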
