Robust Fuzzy Clustering via Trimming and Constraints

A methodology for robust fuzzy clustering is proposed. Because it is based on probability likelihoods, the methodology can be applied to a wide range of statistical problems. Robustness is achieved by trimming a fixed proportion of the “most outlying” observations, which are determined by the data set itself. Constraints on the clusters’ scatters are also needed to obtain mathematically well-defined problems and to avoid detecting uninteresting spurious clusters. The main lines of computationally feasible algorithms are provided, and some simple guidelines for choosing the tuning parameters are briefly outlined. The proposed methodology is illustrated through two applications. The first is aimed at heterogeneous clustering under multivariate normal assumptions, and the second may be useful in fuzzy clusterwise linear regression problems.
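
The trimming idea can be illustrated with a minimal sketch. It is not the algorithm of the paper, which the abstract does not specify. It assumes that each observation already has a single fit score (larger meaning better fitted, for example its largest cluster-wise density value under the current parameters) and discards the fixed proportion with the lowest scores. The function name `trim_mask`, the parameter `alpha`, and the example scores are hypothetical.

```python
import math


def trim_mask(scores, alpha):
    """Return one keep/discard flag per observation (True = kept).

    scores: one fit value per observation, larger = better fitted
            (assumed to be computed elsewhere; hypothetical input).
    alpha:  fixed trimming proportion in [0, 1).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(scores)
    n_trim = math.floor(n * alpha)  # number of observations to discard
    # Indices ordered from worst fitted to best fitted.
    order = sorted(range(n), key=lambda i: scores[i])
    trimmed = set(order[:n_trim])
    return [i not in trimmed for i in range(n)]


# Illustrative scores only: observations 2 and 5 fit every cluster poorly.
scores = [0.90, 0.85, 0.001, 0.70, 0.95, 0.002, 0.80, 0.75, 0.88, 0.92]
keep = trim_mask(scores, alpha=0.2)
```

In this sketch the discarded observations are not fixed in advance: they depend only on how well each point is fitted, so in an iterative procedure the trimmed set would be recomputed whenever the cluster parameters change.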
