Ch. 24. Goodness-of-fit tests for univariate and multivariate normal models

The assumption of univariate or multivariate normality is implicit in most of the statistical procedures routinely used to analyze univariate and multivariate data. It is now well recognized that this assumption is, in general, at best suspect; see, e.g., Geary (1947), Pearson (1929), Jeffreys (1961), Mudholkar and Srivastava (2000a) and the references therein. It is also well established that when normality is violated, most normal-theory procedures either lose validity, i.e., Type I error control, or become highly inefficient in terms of power. Numerous goodness-of-fit tests of univariate normality exist in the literature, but no single test uniformly dominates all others. Nevertheless, several published theoretical and simulation studies indicate that the Shapiro-Wilk test is reasonable and appropriate in most situations of practical importance. The assumption of multivariate normality is harder to expect and to justify, since it requires joint normality of the components in addition to their marginal normality. This structural complexity may be one reason for the time lag in the development of goodness-of-fit tests for multivariate normality. The last two decades, however, have seen advances leading to several competing tests of multivariate normality. Moreover, compared with univariate methods, multivariate methods are more prone to losing Type I error control and to becoming inefficient in terms of power when the normality assumption is violated. The purpose of this article is to present an overview of methods for testing univariate and multivariate normality and to indicate their relative strengths and weaknesses.
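
The point that joint normality demands more than marginal normality can be illustrated with a small simulation. The sketch below is not taken from the chapter; it uses a standard textbook construction in which Y equals X with a random sign flip, so that X and Y are each standard normal but the pair is not bivariate normal. The function names `moment_ratios` and `sign_flip_pair`, the sample size, and the seed are illustrative assumptions. Under joint normality every linear combination would be normal with kurtosis 3, whereas here X + Y is exactly zero about half the time and has population kurtosis 6.

```python
import random


def moment_ratios(values):
    """Return (skewness, kurtosis) as the moment ratios m3/m2^1.5 and m4/m2^2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def sign_flip_pair(n, seed):
    """Hypothetical helper: X ~ N(0, 1) and Y = +X or -X with equal probability.

    Each margin is standard normal, but (X, Y) is not jointly normal.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [xi if rng.random() < 0.5 else -xi for xi in x]
    return x, y


x, y = sign_flip_pair(20000, seed=1)
total = [a + b for a, b in zip(x, y)]  # a linear combination of the components

_, kurt_x = moment_ratios(x)
_, kurt_y = moment_ratios(y)
_, kurt_total = moment_ratios(total)
frac_zero = sum(1 for t in total if t == 0.0) / len(total)

print(f"kurtosis of X:     {kurt_x:.2f}")
print(f"kurtosis of Y:     {kurt_y:.2f}")
print(f"kurtosis of X + Y: {kurt_total:.2f}")
print(f"fraction of exact zeros in X + Y: {frac_zero:.2f}")
```

A test applied separately to each margin would see nothing unusual in this example, which is why procedures that examine only the marginal distributions cannot by themselves establish multivariate normality.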

[1]  C. Quesenberry,et al.  Conditional Probability Integral Transformations and Goodness-of-Fit Tests for Multivariate Normal Distributions , 1979 .

[2]  N. J. H. Small Marginal Skewness and Kurtosis in Testing Multivariate Normality , 1980 .

[3]  R. Fisher THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .

[4]  R. D'Agostino Transformation to normality of the null distribution of g1 , 1970 .

[5]  A. Madansky Prescriptions for working statisticians , 1988 .

[6]  N. L. Johnson,et al.  Systems of frequency curves generated by methods of translation. , 1949, Biometrika.

[7]  Ganapati P. Patil,et al.  Statistical Distributions in Scientific Work , 1981 .

[8]  Ronald Schrader,et al.  Robust analysis of variance , 1977 .

[9]  A. Afifi,et al.  On Tests for Multivariate Normality , 1973 .

[10]  Irene A. Stegun,et al.  Handbook of Mathematical Functions. , 1966 .

[11]  R. D'Agostino An omnibus test of normality for moderate and large size samples , 1971 .

[12]  Ralph B. D'Agostino,et al.  Goodness-of-Fit-Techniques , 2020 .

[13]  G. S. Mudholkar,et al.  A Graphical Procedure for Comparing Goodness-of-fit Tests , 1991 .

[14]  Stephen Warwick Looney,et al.  A comparison of tests for multivariate normality that are based on measures of multivariate skewness and kurtosis , 1992 .

[15]  G. E. Thomas,et al.  Remark AS R19 and Algorithm AS 109: A Remark on Algorithms: AS 63: The Incomplete Beta Integral AS 64: Inverse of the Incomplete Beta Function Ratio , 1977 .

[16]  George P. H. Styan,et al.  Selected Tables in Mathematical Statistics , 1971 .

[17]  M. Healy,et al.  Multivariate Normal Plotting , 1968 .

[18]  J. Oosterhoff Combination of one-sided statistical tests , 1969 .

[19]  Calyampudi R. Rao,et al.  Linear Statistical Inference and Its Applications. , 1975 .

[20]  Narayanaswamy Balakrishnan Advances on Theoretical and Methodological Aspects of Probability and Statistics , 2003 .

[21]  James A. Koziol,et al.  A class of invariant procedures for assessing multivariate normality , 1982 .

[22]  Deo Kumar Srivastava,et al.  A test of p-variate normality , 1992 .

[23]  Ramon C. Littell,et al.  Asymptotic Optimality of Fisher's Method of Combining Independent Tests , 1971 .

[24]  G. S. Mudholkar,et al.  Robust analogs of Hotelling's two-sample T2 , 2000 .

[25]  G. S. Mudholkar,et al.  The Elusive and Illusory Multivariate Normality , 2003 .

[26]  James A. Koziol,et al.  On Assessing Multivariate Normality , 1983 .

[27]  P. R. Krishnaiah Multivariate Analysis IV , 1977 .

[28]  Paul K. Johnson,et al.  Bradykininogen levels in Hodgkin's disease , 1968, Cancer.

[29]  Egon S. Pearson,et al.  Some problems arising in approximating to probability distributions, using moments , 1963 .

[30]  S. Stigler Do Robust Estimators Work with Real Data , 1977 .

[31]  V. A. Uthoff An Optimum Test Property of Two Well-Known Statistics , 1970 .

[32]  L. Shenton,et al.  Omnibus test contours for departures from normality based on √b1 and b2 , 1975 .

[33]  S. Csörgo Testing for independence by the empirical characteristic function , 1985 .

[34]  E. S. Pearson,et al.  Tests for departure from normality. Empirical results for the distributions of b2 and √b1 , 1973 .

[35]  Deo Kumar Srivastava,et al.  Assessing the significance of difference between two quick estimates of location , 1992 .

[36]  Rupert G. Miller Beyond ANOVA, basics of applied statistics , 1987 .

[37]  F. Mosteller On Some Useful "Inefficient" Statistics , 1946 .

[38]  Y. A. Hegazy,et al.  Powerful Modified-EDF Goodness-of-Fit Tests , 1976 .

[39]  R. Gnanadesikan,et al.  Probability plotting methods for the analysis of data. , 1968, Biometrika.

[40]  S. Shapiro,et al.  An Approximate Analysis of Variance Test for Normality , 1972 .

[41]  Charles E. Antle,et al.  Likelihood Ratio Test for Discrimination Between Two Models with Unknown Location and Scale Parameters , 1973 .

[42]  Jorge Luis Romeu,et al.  A comparative study of goodness-of-fit tests for multivariate normality , 1993 .

[43]  Z. Govindarajulu,et al.  A modification of the test of Shapiro and Wilk for normality , 1997 .

[44]  W. Kruskal,et al.  Use of Ranks in One-Criterion Variance Analysis , 1952 .

[45]  W. R. Buckland,et al.  Contributions to Probability and Statistics , 1960 .

[46]  Carol Marchetti,et al.  Characterization Theorems and Goodness-of-Fit Tests , 2002 .

[47]  J. Royston Some Techniques for Assessing Multivariate Normality Based on the Shapiro‐Wilk W , 1983 .

[48]  R. Geary THE RATIO OF THE MEAN DEVIATION TO THE STANDARD DEVIATION AS A TEST OF NORMALITY , 1935 .

[49]  N. J. H. Small Plotting squared radii , 1978 .

[50]  K. Mardia,et al.  Omnibus tests of multinormality based on skewness and kurtosis , 1983 .

[51]  M. Freimer,et al.  Extremes, extreme spacings and outliers in the Tukey and Weibull families , 1989 .

[52]  H. Levene Robust tests for equality of variances , 1961 .

[53]  F. Downton,et al.  Linear estimates with polynomial coefficients. , 1966, Biometrika.

[54]  Deo Kumar Srivastava,et al.  Trimmed T̃2: A robust analog of Hotelling's T2 , 2001 .

[55]  Kai-Tai Fang,et al.  A test for multivariate normality based on sample entropy and projection pursuit , 1995 .

[56]  Calyampudi R. Rao,et al.  Tests of significance in multivariate analysis. , 1948, Biometrika.

[57]  A. M. Hasofer,et al.  Testing for multivariate normality after coordinate transformation , 1990 .

[58]  E. S. Pearson Some Aspects of the Geometry of Statistics: The Use of Visual Presentation in Understanding the Theory and Application of Mathematical Statistics , 1956 .

[59]  Ching-Chuong Lin,et al.  A simple test for normality against asymmetric alternatives , 1980 .

[60]  Govind S. Mudholkar,et al.  The Logit Statistic for Combining Probabilities - An Overview , 1977 .

[61]  E. S. Pearson Biometrika tables for statisticians , 1967 .

[62]  T. M. Williams,et al.  Optimizing Methods in Statistics , 1981 .

[63]  H. A. David,et al.  THE DISTRIBUTION OF THE RATIO, IN A SINGLE NORMAL SAMPLE, OF RANGE TO STANDARD DEVIATION , 1954 .

[64]  V. A. Uthoff,et al.  The Most Powerful Scale and Location Invariant Test of the Normal Versus the Double Exponential , 1973 .

[65]  K. Pearson ON A METHOD OF DETERMINING WHETHER A SAMPLE OF SIZE n SUPPOSED TO HAVE BEEN DRAWN FROM A PARENT POPULATION HAVING A KNOWN PROBABILITY INTEGRAL HAS PROBABLY BEEN DRAWN AT RANDOM , 1933 .

[66]  H. Hartley,et al.  The maximum F-ratio as a short-cut test for heterogeneity of variance. , 1950, Biometrika.

[67]  Edward J. Dudewicz,et al.  A New Statistical Goodness‐of‐Fit Test Based on Graphical Representation , 1992 .

[68]  Pin T. Ng Smoothing Spline Score Estimation , 1994, SIAM J. Sci. Comput..

[69]  Stephen Warwick Looney,et al.  Diagnostic limitations of skewness coefficients in assessing departures from univariate and multivariate normality , 1993 .

[70]  F. David,et al.  Statistical Estimates and Transformed Beta-Variables. , 1960 .

[71]  Muni S. Srivastava,et al.  A measure of skewness and kurtosis and a graphical method for assessing multivariate normality , 1984 .

[72]  N. Bingham Studies in the history of probability and statistics XLVI. Measure into probability: from Lebesgue to Kolmogorov , 2000 .

[73]  A. Öztürk,et al.  A New Graphical Test for Multivariate Normality , 1996 .

[74]  K. Mardia Measures of multivariate skewness and kurtosis with applications , 1970 .

[75]  T. W. Anderson An Introduction to Multivariate Statistical Analysis , 1959 .

[76]  Anil K. Bera,et al.  Tests for multivariate normality with Pearson alternatives , 1983 .

[77]  Norbert Henze,et al.  A New Approach to the BHEP Tests for Multivariate Normality , 1997 .

[78]  S. Shapiro,et al.  An Analysis of Variance Test for Normality (Complete Samples) , 1965 .

[79]  J. Hammersley,et al.  THE ESTIMATION OF LOCATION AND SCALE PARAMETERS FROM GROUPED DATA , 1954 .

[80]  M. Layard,et al.  Robust Large-Sample Tests for Homogeneity of Variances , 1973 .

[81]  Morton B. Brown,et al.  The Small Sample Behavior of Some Statistics Which Test the Equality of Several Means , 1974 .

[82]  R. Plackett Linear Estimation from Censored Data , 1958 .

[83]  Bradford F. Kimball,et al.  On the Choice of Plotting Positions on Probability Paper , 1960 .

[84]  Deo Kumar Srivastava,et al.  Some p-variate adaptations of the Shapiro-Wilk test of normality , 1995 .

[85]  J. Filliben The Probability Plot Correlation Coefficient Test for Normality , 1975 .

[86]  Sucharita Ghosh A New Graphical Tool to Detect Non-Normality , 1996 .

[87]  M. Srivastava,et al.  On assessing multivariate normality based on Shapiro-Wilk W statistic , 1987 .

[88]  Dayanand N. Naik,et al.  Applied Multivariate Statistics with SAS Software , 1997 .

[89]  Egon S. Pearson,et al.  THE DISTRIBUTION OF FREQUENCY CONSTANTS IN SMALL SAMPLES FROM NON-NORMAL SYMMETRICAL AND SKEW POPULATIONS , 1929 .

[90]  G. Box NON-NORMALITY AND TESTS ON VARIANCES , 1953 .

[91]  H. Jeffreys,et al.  The Theory of Probability , 1961 .

[92]  E. S. Pearson,et al.  Tests for departure from normality: Comparison of powers , 1977 .

[93]  K. Mardia Assessment of multinormality and the robustness of Hotelling's T^2 test , 1975 .

[94]  Tests for normality using estimated score function , 1995 .

[95]  G. S. Mudholkar,et al.  Testing significance of a mean vector—A possible alternative to Hotelling's T2 , 1980 .

[96]  N. Henze,et al.  A consistent test for multivariate normality based on the empirical characteristic function , 1988 .

[97]  J. Royston An Extension of Shapiro and Wilk's W Test for Normality to Large Samples , 1982 .

[98]  D. Cox,et al.  An Analysis of Transformations , 1964 .

[99]  E. B. Wilson,et al.  The Distribution of Chi-Square. , 1931, Proceedings of the National Academy of Sciences of the United States of America.

[100]  G. S. Mudholkar,et al.  A construction and appraisal of pooled trimmed-t statistics , 1991 .

[101]  Stephen Warwick Looney,et al.  How to Use Tests for Univariate Normality to Assess Multivariate Normality , 1995 .

[102]  Marshall Freimer,et al.  A study of the generalized Tukey lambda family , 1988 .

[103]  T. W. Epps,et al.  A test for normality based on the empirical characteristic function , 1983 .

[104]  Runze Li,et al.  A multivariate version of Ghosh's T 3 -plot to detect non-multinormality , 1998 .

[105]  M. Bartlett Properties of Sufficiency and Statistical Tests , 1992 .

[106]  A. Gupta,et al.  ESTIMATION OF THE MEAN AND STANDARD DEVIATION OF A NORMAL POPULATION FROM A CENSORED SAMPLE , 1952 .

[107]  R. Geary,et al.  Testing for Normality , 2003 .

[108]  M. Layard,et al.  Large Sample Tests for the Equality of Two Covariance Matrices , 1972 .

[109]  W. G. Cochran The distribution of the largest of a set of estimated variances as a fraction of their total , 1941 .