Maximum Likelihood and the Information Bottleneck

The information bottleneck (IB) method is an information-theoretic formulation of clustering problems. Given a joint distribution p(x, y), this method constructs a new variable T that defines partitions over the values of X that are informative about Y. Maximum likelihood (ML) estimation of mixture models is a standard statistical approach to clustering problems. In this paper, we ask: how are the two methods related? We define a simple mapping between the IB problem and the ML problem for the multinomial mixture model. We show that under this mapping the problems are strongly related. In fact, for a uniform input distribution over X or for a large sample size, the problems are mathematically equivalent. Specifically, in these cases, every fixed point of the IB-functional defines a fixed point of the (log) likelihood and vice versa. Moreover, the values of the functionals at the fixed points are equal under simple transformations. As a result, in these cases, every algorithm that solves one of the problems induces a solution for the other.
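For reference, the two objectives being related can be sketched in their standard forms; the notation below (the trade-off parameter \beta, mixture weights \pi_t, component multinomials \theta_t, and counts n(x,y)) is introduced here for illustration and is not taken from the text above. The IB method minimizes a functional of the soft assignment p(t|x),

\[
\mathcal{L}_{\mathrm{IB}} \;=\; I(T;X) \;-\; \beta\, I(T;Y),
\]

which trades compression of X against preservation of information about Y, while ML for a multinomial mixture maximizes a log-likelihood of the observed co-occurrence counts of the form

\[
\ell \;=\; \sum_{x} \log \sum_{t} \pi_t \prod_{y} \theta_t(y)^{\,n(x,y)}.
\]

The paper's claim is that, under a simple mapping between the quantities appearing in these two expressions, their fixed points coincide in the uniform-input or large-sample regimes.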