Pseudo-stochastic EM for sub-Gaussian α-stable mixture models

Abstract Because sub-Gaussian α-stable densities have no closed-form expression, the M-step of the Expectation-Maximization (EM) algorithm for sub-Gaussian α-stable mixture models (SGαSMMs) is intractable, and an EM algorithm for SGαSMMs has remained an open problem. SGαSMMs can model non-homogeneous Gaussian data and accommodate both outliers and high-leverage data points, properties of primary importance in robust mixture modeling. They are therefore robust and useful tools for modeling heterogeneous data with outlying observations, for example when clustering financial or impulsive data. In this paper, a new EM algorithm that combines a standard EM part (the first part) with a stochastic EM part (the second part) is proposed to obtain the maximum likelihood estimators of the SGαSMM parameters in the M-step. In the first part, all model parameters except the α parameters are estimated in analytical form via EM. In the second part, based on stochastic EM, the maximum likelihood estimator of α in each component is computed from pseudo-simulated data obtained by a suitable rejection sampling scheme. The efficiency of the proposed algorithm is illustrated using both real and simulated data.
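
The abstract does not spell out the rejection sampler, so the following is only an illustrative sketch of how such a step could look, not the authors' procedure. It assumes the standard scale-mixture representation of a univariate symmetric α-stable variable, X = sqrt(W) G, with W positive stable of index α/2 and G ~ N(0, 2σ²), and it assumes the prior of W is used as the proposal. All function names (`positive_stable`, `acceptance_prob`, `sample_latent_scale`) are hypothetical.

```python
import math
import random


def positive_stable(a, rng):
    """Draw W > 0 with Laplace transform exp(-s**a), 0 < a < 1 (Kanter's formula)."""
    u = rng.uniform(0.0, math.pi)
    while u == 0.0:  # guard against the measure-zero endpoint
        u = rng.uniform(0.0, math.pi)
    e = rng.expovariate(1.0)
    while e == 0.0:  # guard against division by zero
        e = rng.expovariate(1.0)
    first = math.sin(a * u) / math.sin(u) ** (1.0 / a)
    second = (math.sin((1.0 - a) * u) / e) ** ((1.0 - a) / a)
    return first * second


def acceptance_prob(w, x, sigma):
    """Ratio h(w) / max h, where h(w) is the N(x; 0, 2*sigma^2*w) density in w.

    h is maximised at w_star = x^2 / (2*sigma^2), so the ratio lies in (0, 1].
    """
    w_star = x * x / (2.0 * sigma * sigma)
    r = w_star / w
    return math.sqrt(r) * math.exp(0.5 * (1.0 - r))


def sample_latent_scale(x, alpha, sigma, n, seed=0):
    """Rejection-sample n draws of the latent scale W given one observation x != 0.

    Assumption: proposal = prior of W (positive stable, index alpha/2);
    a proposal is accepted with probability acceptance_prob(w, x, sigma).
    """
    if x == 0.0:
        raise ValueError("x must be non-zero: the likelihood is unbounded in w")
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        w = positive_stable(alpha / 2.0, rng)
        if rng.random() < acceptance_prob(w, x, sigma):
            draws.append(w)
    return draws


if __name__ == "__main__":
    ws = sample_latent_scale(x=1.5, alpha=1.5, sigma=1.0, n=5, seed=1)
    print(ws)
```

In a stochastic EM iteration, draws of this kind would play the role of the pseudo-simulated data from which α is re-estimated in each component; how the paper performs that update is not stated in the abstract.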
