A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering

We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional rather than univariate scale variables into the scale mixture of Gaussians family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple, tractable form with a closed-form probability density function in any dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.
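To make the idea of a multidimensional scale variable concrete, the following is a minimal sampling sketch for the t-tail case. It rests on assumptions not spelled out in the abstract: the scale matrix is taken in eigen-decomposed form Sigma = D A D^T, and each eigen-direction m gets its own independent weight W_m ~ Gamma(nu_m/2, rate nu_m/2), so that each direction has its own degrees of freedom nu_m. The function name `sample_multiple_scaled_t` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def sample_multiple_scaled_t(n, mu, D, A, nu, seed=None):
    """Draw n samples from a Gaussian scale mixture with one weight per dimension.

    Assumed construction (illustrative, not taken verbatim from the paper):
        Y = mu + D @ (sqrt(A / W) * Z),  Z ~ N(0, I),
        W_m ~ Gamma(shape=nu_m / 2, rate=nu_m / 2), independent across m.
    mu : (M,) location, D : (M, M) orthogonal matrix,
    A  : (M,) positive eigenvalues, nu : (M,) degrees of freedom.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    D = np.asarray(D, dtype=float)
    A = np.asarray(A, dtype=float)
    nu = np.asarray(nu, dtype=float)
    M = mu.shape[0]

    # One weight per sample AND per dimension (the multidimensional scale variable).
    # NumPy parametrises the gamma by scale, so rate nu/2 becomes scale 2/nu.
    w = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(n, M))

    # Standard Gaussian noise, rescaled independently along each eigen-direction.
    z = rng.standard_normal((n, M))
    x = z * np.sqrt(A / w)

    # Rotate back from the eigenbasis and shift by the location.
    return mu + x @ D.T


# Usage: heavy tails along the first axis (nu = 1.5), near-Gaussian along the second (nu = 50).
samples = sample_multiple_scaled_t(
    n=20000, mu=[0.0, 0.0], D=np.eye(2), A=[1.0, 1.0], nu=[1.5, 50.0], seed=0
)
```

With a single shared weight per sample instead of one per dimension, the same construction would reduce to the usual multivariate t, in which every direction shares the same tailweight.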
