Model Validation for Model Selection

Gaussian mixture modelling provides a semi-parametric description of the density of a given data set. The fundamental problem with this approach is that the number of mixture components required to describe the data adequately is not known in advance. In our previous work [12] we introduced a new concept, termed Predictive Validation, as the basis for an automatic method of selecting the number of components. In this paper we investigate the influence of the various parameters of our model selection method in order to develop it into an operational tool. We also demonstrate the utility of our model validation method in two applications in which the selected models are used for supervised classification and outlier detection tasks.