Preparatory data analyses (data screening) are conducted before a main analysis to assess the fit between the data and the assumptions of that main analysis. Different main analyses have different assumptions that vary in importance; violation of some assumptions can lead to a wrong inferential conclusion (and a potential failure of replication), while violation of others yields an analysis that is correct as far as it goes but misses certain additional relationships in the data. Assumptions that are often relevant for continuous variables are normality of sampling distributions, pairwise linearity, absence of outliers, absence of collinearity, independence of errors, and homoscedasticity; these are evaluated by both graphical and statistical methods. When a violation is detected, variables are often transformed or an alternative analytic strategy is employed. Relevant issues in the choice of when and how to screen are the level of measurement of the variables, whether the design produces grouped or ungrouped data, whether cases provide a single response or more than one response, and whether the variables themselves or the residuals of analysis are screened.
Keywords: assumptions; collinearity; distributions; errors; homoscedasticity; outliers; residuals; screening; transformation
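The screening steps named in the abstract (checking distribution shape, flagging outliers, transforming when a violation is found) can be illustrated for a single ungrouped continuous variable. The following is a minimal sketch under stated assumptions, not a procedure taken from the source: the function names, the data, the skewness limit of 1.0, and the standardized-score cutoff of 3.29 (a common convention corresponding to p < .001, two-tailed) are all chosen here for illustration.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: the mean of the cubed standardized scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def z_score_outliers(values, cutoff=3.29):
    """Return values whose absolute standardized score exceeds the cutoff.

    The 3.29 default is an assumed convention, not a rule from the source.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs((v - mean) / sd) > cutoff]


def screen(values, skew_limit=1.0):
    """Screen one variable and try a log transform if it is strongly
    positively skewed. The skew_limit threshold is an illustrative assumption.
    """
    report = {"skew": skewness(values), "outliers": z_score_outliers(values)}
    if report["skew"] > skew_limit and min(values) > 0:
        transformed = [math.log10(v) for v in values]
        report["skew_after_log"] = skewness(transformed)
    return report


# Hypothetical positively skewed scores, invented for illustration only.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12, 20, 45, 110]
report = screen(scores)
print(report)
```

In this sketch the raw variable is screened directly. As the abstract notes, for some designs it is the residuals of the analysis rather than the variables themselves that would be passed to a check of this kind, and graphical inspection would normally accompany the numerical summaries.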