Mixture Model Clustering of Uncertain Data

This paper addresses the problem of fitting mixture densities to uncertain data using the EM algorithm. Uncertain data are modelled by multivariate uncertainty zones, which generalize multivariate interval-valued data. We develop an EM algorithm that handles uncertainty zones around points of ℝ^p in order to estimate the parameters of a mixture model defined on ℝ^p and to obtain a fuzzy clustering or partition. This EM algorithm requires the evaluation of multidimensional integrals over each uncertainty zone at each iteration. In the diagonal Gaussian mixture model case, these integrals can be computed using only the one-dimensional normal cumulative distribution function. Results on simulated data indicate that the proposed algorithm estimates the true underlying density better than the classical EM algorithm applied to the imprecise data, especially when the degree of imprecision is high.
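To illustrate why the diagonal Gaussian case is tractable, the following minimal sketch assumes that each uncertainty zone is an axis-aligned hyper-rectangle given by lower and upper bounds. This is an assumption made for the example, since the abstract does not fix the shape of the zones. Under a diagonal covariance the density factorizes over coordinates, so the integral over the box is a product of differences of one-dimensional normal CDFs. The function names (`normal_cdf`, `box_probability`, `responsibilities`) and the form of the membership computation are hypothetical and are not taken from the paper.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """One-dimensional normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def box_probability(lower, upper, mean, std):
    """Mass of a diagonal Gaussian over the box [lower, upper].

    Assumption: the uncertainty zone is an axis-aligned hyper-rectangle,
    so the p-dimensional integral is a product of p one-dimensional terms.
    """
    prob = 1.0
    for a, b, m, s in zip(lower, upper, mean, std):
        prob *= normal_cdf(b, m, s) - normal_cdf(a, m, s)
    return prob

def responsibilities(lower, upper, weights, means, stds):
    """Illustrative fuzzy memberships of one zone in each component.

    Assumption: membership is proportional to the mixing weight times
    the component's mass over the zone (not confirmed by the abstract).
    """
    joint = [w * box_probability(lower, upper, m, s)
             for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]

# A zone centred between two equally weighted components.
memberships = responsibilities(
    lower=[-1.0, -1.0], upper=[1.0, 1.0],
    weights=[0.5, 0.5],
    means=[[-2.0, 0.0], [2.0, 0.0]],
    stds=[[1.0, 1.0], [1.0, 1.0]],
)
```

In this sketch, a zone lying far in the tails of every component would give a total mass that underflows to zero, so a practical implementation would probably work with log-probabilities instead.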