Clustering Univariate Observations via Mixtures of Unimodal Normal Mixtures

A mixture model is proposed in which each component is modelled flexibly as a unimodal mixture of normal distributions with a common variance and equispaced support points. The model is mainly intended for clustering univariate observations, where each component identifies a distinct cluster; in this setting, conventional mixture models may overestimate the number of clusters when the component distribution is misspecified. Maximum likelihood estimation is carried out through an EM algorithm in which the complete-data log-likelihood is maximized under the unimodality constraint by solving a series of least squares problems under linear inequality constraints. The Bayesian Information Criterion is used to select the number of components. A simulation study shows that this criterion performs well even when the true component distribution has strong skewness and/or kurtosis. This robustness stems from the flexibility of the proposed model and is particularly useful in clustering applications.
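As a rough illustration of the component structure described above, the following Python sketch evaluates one component as a mixture of normal kernels with a common standard deviation centred on equispaced support points. It is a minimal sketch under stated assumptions, not the estimation procedure of the paper: the function names, the example weights, and the reading of "unimodal" as a weight sequence that weakly rises to a single peak and then weakly falls are all assumptions made here for illustration. The `bic` helper uses the usual convention, -2 log-likelihood plus the number of free parameters times log n (lower is better); how the free parameters of the model are counted is not specified here and is left as an input.

```python
import numpy as np

def component_density(x, support, weights, sigma):
    """Density of one component: normal kernels with common sigma,
    centred on the support points and combined with the given weights."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - support[None, :]) / sigma
    kernels = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    return kernels @ weights

def is_unimodal(weights, tol=1e-12):
    """Assumed constraint: weights weakly increase up to a peak,
    then weakly decrease (a set of linear inequalities on the weights)."""
    weights = np.asarray(weights, dtype=float)
    d = np.diff(weights)
    peak = int(np.argmax(weights))
    return bool(np.all(d[:peak] >= -tol) and np.all(d[peak:] <= tol))

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; smaller values are preferred."""
    return -2.0 * loglik + n_params * np.log(n_obs)

# Hypothetical example: a right-skewed component on 7 equispaced points.
support = np.linspace(-2.0, 4.0, 7)
weights = np.array([0.05, 0.30, 0.25, 0.17, 0.12, 0.07, 0.04])
sigma = 0.8

grid = np.linspace(-12.0, 14.0, 5001)
dens = component_density(grid, support, weights, sigma)
# Trapezoidal rule: the density should integrate to roughly 1.
area = float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)))
```

A single component of this form can already be skewed or heavy-tailed while still representing one cluster, which is the property the abstract credits for avoiding spurious extra components.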