Determination of the number of components in Gaussian mixtures using agglomerative clustering

Modeling data sets by mixtures is a common technique in many pattern recognition applications. The expectation-maximization (EM) algorithm for mixture decomposition has the disadvantage that the number of components in the mixture must be specified in advance. In this paper, we propose a new objective function whose minimum gives the number of components automatically. The proposed method, the agglomerative Gaussian mixture decomposition algorithm, is then used to determine the number of hidden nodes in a radial basis function network. We present results on real data sets which indicate that the proposed method is not sensitive to initialization and gives better classification rates.
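
To make the limitation of standard EM concrete, the following is a minimal sketch of plain EM for a one-dimensional Gaussian mixture. It is not the agglomerative algorithm proposed in this paper, whose objective function is not reproduced here. The function names (`normal_pdf`, `em_gmm_1d`), the quantile-based initialization, and the variance floor are illustrative assumptions. The point of the sketch is that the number of components `k` has to be supplied by the caller.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k, n_iter=100):
    """Plain EM for a 1-D Gaussian mixture (illustrative sketch).

    k is fixed by the caller; nothing in this procedure can change it.
    Returns (weights, means, variances), each a list of length k.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumption: initialize means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, 1e-6)
    variances = [var_all] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            if total <= 0.0:
                resp.append([1.0 / k] * k)  # numerical fallback
            else:
                resp.append([p / total for p in joint])

        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component has no support; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                1e-6,  # assumed variance floor to avoid degenerate components
            )
    return weights, means, variances


if __name__ == "__main__":
    rng = random.Random(0)
    sample = ([rng.gauss(-5.0, 1.0) for _ in range(200)]
              + [rng.gauss(5.0, 1.0) for _ in range(200)])
    # The caller must guess k; here it happens to match the generating model.
    print(em_gmm_1d(sample, k=2))
```

Calling `em_gmm_1d` with a different `k` on the same sample still returns a fitted mixture, so the procedure itself gives no indication of whether the chosen number of components is appropriate.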
