Distributed information-theoretic clustering

We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(X^n)$ and $g(Y^n)$ that maximize the normalized mutual information $I(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the auxiliary random variables in the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the two bounds in this case. Furthermore, we investigate a multiple-description (MD) extension of the CEO problem with a mutual information constraint. Surprisingly, this MD-CEO problem admits a tight single-letter characterization of the achievable region.
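The objective $I(f(X^n); g(Y^n))/n$ can be illustrated in the simplest single-letter setting ($n = 1$, identity encodings): for a doubly symmetric binary source with crossover probability $p$, the mutual information between the two observations equals $1 - h_2(p)$, where $h_2$ is the binary entropy function. The sketch below is illustrative only (the helper `mutual_information` and the chosen $p$ are not from the paper):

```python
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in bits, computed from a joint pmf matrix pxy[x, y]."""
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y (row vector)
    mask = pxy > 0                        # avoid log(0) on zero-probability cells
    return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

# Doubly symmetric binary source: X ~ Bernoulli(1/2), P(Y != X) = p
p = 0.11
pxy = np.array([[(1 - p) / 2, p / 2],
                [p / 2, (1 - p) / 2]])

# With unit-rate (identity) encodings f, g, the objective reduces to
# I(X;Y) = 1 - h2(p), the classical value for this source.
h2 = -p * np.log2(p) - (1 - p) * np.log2(1 - p)
mi = mutual_information(pxy)
```

Rate-limiting $f$ and $g$ trades this value off against the encoding rates, which is exactly the tension the inner and outer bounds of the paper capture.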
