A Compression-Based Dissimilarity Measure for Multi-task Clustering

Virtually all existing multi-task learning methods for string data require either domain-specific knowledge to extract feature representations or careful tuning of many input parameters. In this work, we propose a feature-free and parameter-light multi-task clustering algorithm for string data. To transfer knowledge between domains, we introduce a novel dictionary-based compression dissimilarity measure. Experimental results with extensive comparisons demonstrate the generality and effectiveness of our proposal.
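The paper's own measure is dictionary-based and is not reproduced here; as a rough illustration of how a feature-free, compression-based dissimilarity works in general, the following is a minimal sketch of the normalized compression distance (NCD), using zlib as an off-the-shelf compressor. The function names and the choice of compressor are ours, not the paper's.

```python
import random
import zlib


def compressed_len(s: bytes) -> int:
    # Length of s after zlib compression at the highest level; a stand-in
    # for C(s), the compressed size of s, in the NCD formula.
    return len(zlib.compress(s, 9))


def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance:
    #   NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    # Values near 0 mean x and y share much structure (one helps compress
    # the other); values near 1 mean they share almost none.
    cx, cy, cxy = compressed_len(x), compressed_len(y), compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Highly similar strings score lower than unrelated ones.
repeated = b"the quick brown fox " * 100
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(2000))
assert ncd(repeated, repeated) < ncd(repeated, noise)
```

The appeal of such measures for string clustering is that no feature extraction or parameter tuning is needed: any standard compressor supplies the dissimilarity directly, which is the spirit of the feature-free, parameter-light approach advocated above.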
