Model centroids for the simplification of Kernel Density estimators

Gaussian mixture models are a widespread tool for modeling varied and complex probability density functions. They can be estimated using Expectation-Maximization or Kernel Density Estimation. Expectation-Maximization leads to compact models but may be expensive to compute, whereas Kernel Density Estimation yields large models that are cheap to build. In this paper we present new methods to obtain high-quality models that are both compact and fast to compute. This is accomplished with clustering methods and centroid computation. The quality of the resulting mixtures is evaluated in terms of log-likelihood and Kullback-Leibler divergence using examples from a bioinformatics application.
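
As a rough illustration of the idea, the sketch below treats a one-dimensional kernel density estimate as a Gaussian mixture with one component per sample, groups the components with a plain k-means on their means, and replaces each group by a single Gaussian. It is a minimal sketch under stated assumptions, not the method of this paper: the merge step uses moment matching as a stand-in for the model centroids, and the function names (`kde_as_mixture`, `simplify`, `log_likelihood`), the bandwidth, and the synthetic data are all hypothetical.

```python
import numpy as np


def kde_as_mixture(samples, bandwidth):
    """View a Gaussian KDE as a mixture: one equal-weight component per sample."""
    means = np.asarray(samples, dtype=float)
    n = len(means)
    weights = np.full(n, 1.0 / n)
    variances = np.full(n, bandwidth ** 2)
    return weights, means, variances


def simplify(weights, means, variances, k, n_iter=20):
    """Reduce a 1-D Gaussian mixture to at most k components.

    Assumption: components are grouped by k-means on their means, and each
    group is merged by moment matching (a stand-in for the paper's centroids).
    """
    # Deterministic initialisation: evenly spaced quantiles of the means.
    centers = np.quantile(means, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign each component to its nearest center, then update centers.
        labels = np.argmin(np.abs(means[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            mask = labels == j
            if np.any(mask):
                centers[j] = np.average(means[mask], weights=weights[mask])

    new_w, new_m, new_v = [], [], []
    for j in range(k):
        mask = labels == j
        if not np.any(mask):
            continue  # skip empty groups
        w = weights[mask].sum()
        m = np.sum(weights[mask] * means[mask]) / w
        # Merged variance = within-component variance + spread of the means.
        v = np.sum(weights[mask] * (variances[mask] + (means[mask] - m) ** 2)) / w
        new_w.append(w)
        new_m.append(m)
        new_v.append(v)
    return np.array(new_w), np.array(new_m), np.array(new_v)


def log_likelihood(x, weights, means, variances):
    """Total log-likelihood of the points x under a 1-D Gaussian mixture."""
    x = np.asarray(x, dtype=float)[:, None]
    dens = weights * np.exp(-0.5 * (x - means) ** 2 / variances)
    dens = dens / np.sqrt(2.0 * np.pi * variances)
    return float(np.sum(np.log(dens.sum(axis=1))))


# Synthetic two-mode data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 100), rng.normal(3.0, 1.0, 100)])

full = kde_as_mixture(data, bandwidth=0.3)
small = simplify(*full, k=2)

print("components:", len(full[0]), "->", len(small[0]))
print("log-likelihood (KDE):       ", log_likelihood(data, *full))
print("log-likelihood (simplified):", log_likelihood(data, *small))
```

Moment matching keeps the total weight and the overall mean of the mixture unchanged, which makes it a convenient baseline here. Comparing the two log-likelihoods gives a first impression of how much quality is traded for compactness.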