Clustering with EM: Complex Models vs. Robust Estimation

Clustering multivariate data contaminated by noise is difficult, particularly in mixture model estimation, because noisy data can significantly affect the parameter estimates. This paper addresses the problem in the context of likelihood maximization with the Expectation-Maximization (EM) algorithm. Two approaches are compared. The first defines mixture models that explicitly account for noise. The second relies on robust estimation of the model parameters in the maximization step of EM. Both were tested separately, then jointly. Finally, a hybrid model is proposed. Results on artificial data are presented and discussed.
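
As an illustration of the first approach, the following minimal sketch runs EM on a Gaussian mixture augmented with one extra component that absorbs noise. It is not the model studied in this paper: the uniform noise density over the bounding box of the data, the small covariance regularization, and the names `em_with_noise_component`, `gaussian_pdf`, and `init_means` are all assumptions made for the example.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    sol = np.linalg.solve(cov, diff.T).T
    maha = np.sum(diff * sol, axis=1)          # squared Mahalanobis distances
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (maha + logdet + d * np.log(2.0 * np.pi)))

def em_with_noise_component(X, k, n_iter=100, init_means=None, seed=0):
    """EM for k Gaussians plus one uniform noise component (illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumption: noise is uniform over the bounding box of the data.
    volume = np.prod(X.max(axis=0) - X.min(axis=0))
    noise_density = 1.0 / volume
    if init_means is None:
        means = X[rng.choice(n, size=k, replace=False)].astype(float)
    else:
        means = np.array(init_means, dtype=float)
    reg = 1e-6 * np.eye(d)                     # keeps covariances invertible
    covs = np.array([np.cov(X.T).reshape(d, d) + reg for _ in range(k)])
    weights = np.full(k + 1, 1.0 / (k + 1))    # last weight is the noise class
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = np.empty((n, k + 1))
        for j in range(k):
            dens[:, j] = weights[j] * gaussian_pdf(X, means[j], covs[j])
        dens[:, k] = weights[k] * noise_density
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates; the noise component
        # has no free parameter besides its mixing weight.
        nk = resp.sum(axis=0)
        weights = nk / n
        for j in range(k):
            means[j] = resp[:, j] @ X / nk[j]
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg
    return weights, means, covs, resp
```

The second approach would leave the model unchanged and instead replace the weighted mean and covariance updates in the M-step with robust estimators; that variant is not shown here.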