When a set of patterns is collected for pattern recognition, the number of major clusters may not be known, and the set may contain outliers. This paper proposes a method that estimates the number of major clusters appropriately under these conditions. Once a model describing the distribution of the patterns is defined, its parameters can be estimated by maximum likelihood, and the number of parameters can be optimized by the Akaike information criterion (AIC) or the minimum description length (MDL), which in turn gives an estimate of the number of clusters. When the set of patterns contains outliers, however, they affect the parameter estimation and, accordingly, the estimate of the number of clusters. This paper therefore also proposes two robust clustering methods (MARC1 and MARC2), based on the maximum-likelihood method for the multivariate normal mixture model, that aim to reduce the effect of outliers. The number of clusters is then estimated by AIC and MDL using the parameters obtained from the clustering. Experimental results show that even when 45 percent of the patterns in each cluster are replaced by outliers, their effect on the parameter estimation can be reduced and an adequate number of clusters can be estimated. The limits of applicability of the proposed method are investigated, and results of applying it to region segmentation are presented.
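The model-selection step described above can be illustrated with a minimal sketch. It assumes a full-covariance normal mixture and the commonly used forms AIC = -2 log L + 2p and MDL = -log L + (p/2) log n, where p is the number of free parameters and n the number of patterns; the exact forms used in the paper may differ. The function names (`gmm_free_params`, `aic`, `mdl`, `select_k`) are hypothetical, and the maximized log-likelihoods are taken as input from whatever clustering procedure is used; this is not an implementation of MARC1 or MARC2.

```python
import math


def gmm_free_params(k, d):
    """Free parameters of a k-component, d-dimensional normal mixture
    with full covariance matrices (an assumed model form)."""
    weights = k - 1                 # mixing weights sum to 1
    means = k * d                   # one d-dimensional mean per component
    covs = k * d * (d + 1) // 2     # symmetric d x d covariance per component
    return weights + means + covs


def aic(log_lik, n_params):
    """Akaike information criterion (smaller is better)."""
    return -2.0 * log_lik + 2.0 * n_params


def mdl(log_lik, n_params, n_samples):
    """Two-part minimum description length (smaller is better)."""
    return -log_lik + 0.5 * n_params * math.log(n_samples)


def select_k(log_liks, d, n_samples, criterion="mdl"):
    """Pick the number of clusters that minimizes the chosen criterion.

    log_liks maps each candidate k to the maximized log-likelihood
    obtained by fitting a k-component mixture to the patterns.
    """
    def score(k):
        p = gmm_free_params(k, d)
        if criterion == "aic":
            return aic(log_liks[k], p)
        return mdl(log_liks[k], p, n_samples)

    return min(log_liks, key=score)
```

Because the MDL penalty grows with log n while the AIC penalty does not, MDL tends to favor fewer clusters than AIC on large pattern sets. Outliers matter here because they distort the fitted parameters and hence the log-likelihoods fed into either criterion.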