Iterative sIB algorithm

Recent years have witnessed growing interest in information bottleneck theory. Among the algorithms in the literature, the sequential Information Bottleneck (sIB) algorithm is recognized for its balance between accuracy and complexity. However, like many other optimization techniques, it is easily trapped in local optima. To address this, we propose an iterative sIB algorithm (isIB) based on mutation for the clustering problem. Starting from initial solution vectors of cluster labels seeded by the sIB algorithm, our algorithm randomly selects a subset of elements and mutates their cluster labels according to an optimal mutation rate. The resulting candidates are then iteratively refined using a genetic algorithm. Experimental results on benchmark data sets validate the advantage of the iterative sIB algorithm over sIB in both accuracy and efficiency.
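The mutation step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, the uniform label-reassignment operator, and the fixed `mutation_rate` are all assumptions for the sketch; the paper derives an optimal rate and embeds the step in a genetic-algorithm loop.

```python
import random

def mutate_labels(labels, n_clusters, mutation_rate, rng=None):
    """Hypothetical sketch of the isIB-style mutation step: each element's
    cluster label is reassigned with probability `mutation_rate`."""
    if rng is None:
        rng = random.Random()
    mutated = list(labels)
    for i in range(len(mutated)):
        if rng.random() < mutation_rate:
            # Reassign to a different cluster chosen uniformly at random.
            choices = [c for c in range(n_clusters) if c != mutated[i]]
            mutated[i] = rng.choice(choices)
    return mutated

# Example: perturb a seed assignment (e.g. one produced by a run of sIB)
seed = [0, 0, 1, 1, 2, 2]
candidate = mutate_labels(seed, n_clusters=3, mutation_rate=0.2,
                          rng=random.Random(42))
```

In a full isIB loop, such mutated candidates would be scored by the information-bottleneck objective and the best ones kept for the next iteration; that selection step is omitted here.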
