Robust M-Estimation Based Bayesian Cluster Enumeration for Real Elliptically Symmetric Distributions

Robustly determining the optimal number of clusters in a data set is essential in a wide range of applications. Cluster enumeration becomes challenging when the true underlying structure in the observed data is corrupted by heavy-tailed noise and outliers. Recently, Bayesian cluster enumeration criteria have been derived by formulating cluster enumeration as maximization of the posterior probability of candidate models. This article generalizes robust Bayesian cluster enumeration so that it can be used with any Real Elliptically Symmetric (RES) distributed mixture model. Our framework also covers M-estimators, which allow for mixture models that are decoupled from a specific probability distribution. Huber's and Tukey's M-estimators are discussed as examples. We derive a robust criterion for data sets with finite sample size, and also provide an asymptotic approximation that reduces the computational cost at large sample sizes. The algorithms are applied to simulated and real-world data sets, including radar-based person identification, and show a significant robustness improvement in comparison to existing methods.
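To illustrate how Huber's and Tukey's M-estimators downweight outliers, the following minimal sketch uses the textbook weight functions, written in terms of a squared distance t. It is not the article's exact parametrization: normalization constants are omitted, the tuning constant c = 2.0, the unit scale, and the helper name robust_mean are assumptions made for illustration only.

```python
def huber_weight(t, c=2.0):
    """Textbook Huber weight on a squared distance t >= 0 (constants omitted)."""
    c2 = c * c
    # Full weight inside the threshold, decaying as c^2 / t outside it.
    return 1.0 if t <= c2 else c2 / t


def tukey_weight(t, c=2.0):
    """Textbook Tukey biweight on a squared distance t >= 0 (constants omitted)."""
    c2 = c * c
    # Smoothly decreasing weight that reaches exactly zero beyond the threshold.
    return (1.0 - t / c2) ** 2 if t <= c2 else 0.0


def robust_mean(xs, weight, n_iter=50):
    """Hypothetical helper: iteratively reweighted mean of scalar data.

    Assumes unit scale and starts from the sample median, so at least one
    point has a nonzero weight in the first iteration.
    """
    mu = sorted(xs)[len(xs) // 2]
    for _ in range(n_iter):
        ws = [weight((x - mu) ** 2) for x in xs]
        mu = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
    return mu


data = [0.1, -0.2, 0.0, 0.3, 100.0]  # four inliers and one gross outlier
plain_mean = sum(data) / len(data)
huber_mean = robust_mean(data, huber_weight)
tukey_mean = robust_mean(data, tukey_weight)
```

In this toy setting the plain mean is pulled toward the outlier, while both weighted means stay near the inliers. Huber's weight only shrinks the outlier's influence, whereas Tukey's weight rejects it entirely.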
