A feature-free and parameter-light multi-task clustering framework

The last two decades have witnessed extensive research on multi-task learning algorithms in domains as diverse as bioinformatics, text mining, natural language processing, and image and video content analysis. However, existing multi-task learning methods require either domain-specific knowledge to extract features or the careful setting of many input parameters. Both requirements carry significant disadvantages; most obviously, poorly extracted features or incorrectly set parameters can lead us to discover spurious or non-existent patterns. In this work, we propose a feature-free and parameter-light multi-task clustering framework that overcomes these disadvantages. Our proposal is motivated by the recent success of Kolmogorov complexity-based methods across a variety of applications. Such methods, however, are defined only for single-task problems because they lack a mechanism for sharing knowledge between tasks. To address this limitation, we introduce a novel dictionary-based compression dissimilarity measure that allows knowledge to be shared effectively across tasks. Experimental results with extensive comparisons demonstrate the generality and effectiveness of our proposal.
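The core idea of a compression-based dissimilarity can be illustrated with a minimal sketch. The code below is an illustrative assumption, not the paper's implementation: it approximates the compressor C(.) with an LZW-style code-length count and plugs it into a Normalized Compression Distance-style ratio, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)). The optional `shared` argument, which pre-seeds the dictionary with phrases drawn from other tasks' data, is a hypothetical stand-in for the cross-task knowledge-sharing mechanism described in the abstract.

```python
def _lzw_phrases(text):
    """Collect the phrases an LZW-style scan of `text` would learn."""
    phrases = {ch for ch in text}
    w = ""
    for ch in text:
        if w + ch in phrases:
            w = w + ch
        else:
            phrases.add(w + ch)
            w = ch
    return phrases


def lzw_code_length(text, shared=None):
    """Approximate compressed size of `text` as the number of LZW codes emitted.

    `shared` is an optional iterable of strings from *other* tasks whose
    phrases pre-seed the dictionary; this is a hypothetical stand-in for
    cross-task knowledge sharing, not the paper's exact mechanism.
    """
    dictionary = {ch for ch in text}
    for s in (shared or []):
        dictionary |= _lzw_phrases(s)
    codes = 0
    w = ""
    for ch in text:
        if w + ch in dictionary:
            w = w + ch
        else:
            codes += 1                 # emit a code for the current phrase w
            dictionary.add(w + ch)     # learn the new phrase
            w = ch
    if w:
        codes += 1                     # flush the final phrase
    return codes


def compression_dissimilarity(x, y, shared=None):
    """NCD-style dissimilarity: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx = lzw_code_length(x, shared)
    cy = lzw_code_length(y, shared)
    cxy = lzw_code_length(x + y, shared)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = "the quick brown fox jumps over the lazy dog " * 4
    b = "the quick brown fox naps under the lazy dog " * 4
    c = "lorem ipsum dolor sit amet consectetur adipiscing " * 4
    # Texts that share many phrases should typically come out less dissimilar.
    print(compression_dissimilarity(a, b))
    print(compression_dissimilarity(a, c))
```

Because the measure is computed directly from raw strings, no feature extraction is needed, and the only tunable choice in this sketch is which (if any) shared data seeds the dictionary.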
