Multi-task clustering through instances transfer

We propose a multi-task clustering method that transfers knowledge of instances. The distance between samples in different tasks is reweighted by learning a shared subspace. Related samples from other tasks are reused as auxiliary data to aid clustering. Our method maintains the label marginal distribution of each individual task. Better performance is observed compared with other multi-task clustering methods.

Clustering is an essential problem in machine learning and data mining. Because many related tasks exist in the real world, multi-task clustering, which improves the clustering performance of each task by transferring knowledge across related tasks, has received increasing attention. In general, knowledge transfer can be accomplished in different ways. Nevertheless, apart from transferring knowledge of feature representations, other forms of knowledge transfer have seldom been adopted for multi-task clustering. In this paper, we propose a general multi-task clustering algorithm that transfers knowledge of instances. Our algorithm reweights the distance between samples in different tasks by learning a shared subspace, then selects, for each sample, its nearest neighbors from the other tasks in the learned shared subspace as auxiliary data to aid the clustering of each individual task. Experiments on real data sets in text mining and image mining demonstrate that the proposed algorithm outperforms traditional single-task clustering methods and existing cross-domain multi-task clustering methods.
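The instance-selection step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the shared subspace here is simply the top principal directions of the pooled data (an assumed stand-in for the learned subspace), and the function names `pooled_pca_subspace` and `select_auxiliary`, the parameter `k`, and the synthetic data are all hypothetical. The distance-reweighting objective and the clustering step itself are omitted.

```python
import numpy as np

def pooled_pca_subspace(tasks, dim):
    """Assumed stand-in for the learned shared subspace:
    top `dim` principal directions of all tasks pooled together."""
    pooled = np.vstack(tasks)
    mean = pooled.mean(axis=0)
    _, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
    return mean, vt[:dim].T  # mean: (d,), basis: (d, dim)

def select_auxiliary(target, other, mean, basis, k):
    """Indices of samples in `other` that are among the k nearest
    neighbors (in the subspace) of at least one sample in `target`."""
    zt = (target - mean) @ basis
    zo = (other - mean) @ basis
    # Pairwise squared Euclidean distances, shape (n_target, n_other).
    d2 = ((zt[:, None, :] - zo[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.unique(nearest)

# Hypothetical data: task B has 15 samples related to task A and
# 15 unrelated samples shifted far away along the first feature.
rng = np.random.default_rng(0)
task_a = rng.normal(size=(20, 5))
related = rng.normal(size=(15, 5))
unrelated = rng.normal(size=(15, 5))
unrelated[:, 0] += 100.0
task_b = np.vstack([related, unrelated])

mean, basis = pooled_pca_subspace([task_a, task_b], dim=2)
aux = select_auxiliary(task_a, task_b, mean, basis, k=3)
augmented_a = np.vstack([task_a, task_b[aux]])  # task A plus auxiliary data
```

In this toy setting only the related samples of task B are picked up as auxiliary data, so `augmented_a` could then be passed to any single-task clustering routine.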
