On point estimation of the abnormality of a Mahalanobis index

Mahalanobis distance may be used as a measure of the disparity between an individual’s profile of scores and the average profile of a population of controls. The degree to which the individual’s profile is unusual can then be equated to the proportion of the population who would have a larger Mahalanobis distance than the individual. Several estimators of this proportion are examined. These include plug-in maximum likelihood estimators, medians, the posterior mean from a Bayesian probability matching prior, an estimator derived from a Taylor expansion, and two forms of polynomial approximation, one based on Bernstein polynomials and one on a quadrature method. Simulations show that some estimators, including the commonly used plug-in maximum likelihood estimators, can have substantial bias for small or moderate sample sizes. The polynomial approximations yield estimators that have low bias, with the quadrature method marginally preferable to Bernstein polynomials. However, the polynomial estimators sometimes yield infeasible estimates that lie outside the 0–1 range. While none of the estimators is perfectly unbiased, the median estimators match their definition: in simulations, their estimates of the proportion have a median error close to zero. The standard median estimator can give unrealistically small estimates (including 0), and an adjustment is proposed that ensures estimates are always credible. This adjusted median estimator has much to recommend it when unbiasedness is not of paramount importance, while the quadrature method is recommended when bias is the dominant issue.
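To make the quantity being estimated concrete, the following is a minimal sketch of one simple plug-in estimate of the proportion, not necessarily the exact form studied in the paper. It assumes the controls are multivariate normal, so that the squared Mahalanobis distance of a random population member from the population mean follows a chi-square distribution with p degrees of freedom (p being the number of scores), and it simply substitutes the sample mean and sample covariance for the population values. The function names `chi2_sf`, `mahalanobis_sq` and `plugin_abnormality` are hypothetical, and the n - 1 covariance divisor is an assumption (a strict maximum likelihood plug-in would divide by n).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for X ~ chi-square(df), computed from
    the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + k)
        total += term
        k += 1
    log_lower = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))

def mahalanobis_sq(x, controls):
    """Squared Mahalanobis distance of profile x from the control sample mean,
    using the sample covariance (divisor n - 1, an assumption of this sketch)."""
    n, p = len(controls), len(x)
    mean = [sum(row[j] for row in controls) / n for j in range(p)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in controls) / (n - 1)
            for j in range(p)] for i in range(p)]
    d = [x[j] - mean[j] for j in range(p)]
    # Solve cov * w = d by Gauss-Jordan elimination with partial pivoting.
    aug = [cov[i] + [d[i]] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(p):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    w = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum(d[i] * w[i] for i in range(p))

def plugin_abnormality(x, controls):
    """Plug-in estimate of the proportion of the population with a larger
    Mahalanobis distance than the individual with profile x."""
    return chi2_sf(mahalanobis_sq(x, controls), len(x))

# Toy control sample with two scores per person (illustrative data only).
controls = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(plugin_abnormality((3.0, 1.0), controls))
```

A plug-in estimate of this kind ignores the sampling error in the estimated mean and covariance, which is one reason such estimators can be noticeably biased when the control sample is small or moderate in size.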
