Type I Error Inflation in the Presence of a Ceiling Effect

Many variables in biomedical research (e.g., indices of health status) are measured with ceiling effects, in which a substantial number of subjects attain the highest possible scale value because the scale discriminates only among individuals in the low to moderate range. In social surveys, variables such as income and alcohol consumption may also be deliberately capped, creating a ceiling effect, to protect the privacy and identity of respondents at the upper end of the distribution. This article shows that if one controls for such a variable using ordinary linear regression and then tests another independent variable that is actually unrelated to the outcome, the rate of Type I error (false significance) can be inflated. We present simulations in which standard tests conducted at the 5% significance level have actual Type I error rates approaching 100% for large samples. Statistical solutions are explored, but the best recommendation is to construct scales that are not subject to ceiling effects.
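The mechanism can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the data-generating model, the correlation `rho = 0.5`, the ceiling at the median of the covariate, the normal-approximation critical value of 1.96, and the function names `rejects_null` and `type_i_error_rate` are all assumptions chosen for illustration. The outcome depends only on `x1`; `x2` is correlated with `x1` but has no effect of its own. The regression adjusts for the capped version of `x1` and tests the coefficient of `x2`.

```python
import math
import random


def centered(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def rejects_null(n, ceiling, rho, rng):
    """Simulate one data set; return True if x2 tests significant at 5%."""
    x1_obs, x2, y = [], [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        # x2 is correlated with x1 but has no direct effect on y.
        x2.append(rho * x1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1))
        y.append(x1 + rng.gauss(0, 1))
        # The analyst only sees x1 capped at the ceiling.
        x1_obs.append(min(x1, ceiling))

    # OLS of y on (intercept, x1_obs, x2) via centered normal equations.
    a, b, yc = centered(x1_obs), centered(x2), centered(y)
    s11 = sum(u * u for u in a)
    s22 = sum(v * v for v in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * w for u, w in zip(a, yc))
    s2y = sum(v * w for v, w in zip(b, yc))
    det = s11 * s22 - s12**2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    # Residual variance with n - 3 degrees of freedom, then the SE of b2.
    rss = sum((w - b1 * u - b2 * v) ** 2 for u, v, w in zip(a, b, yc))
    se_b2 = math.sqrt(rss / (n - 3) * s11 / det)
    return abs(b2 / se_b2) > 1.96  # normal approximation to the t-test


def type_i_error_rate(n, ceiling, rho=0.5, reps=100, seed=0):
    """Share of simulated data sets in which x2 is falsely significant."""
    rng = random.Random(seed)
    return sum(rejects_null(n, ceiling, rho, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    for n in (100, 500, 2000):
        capped = type_i_error_rate(n, ceiling=0.0)
        uncapped = type_i_error_rate(n, ceiling=math.inf)
        print(f"n={n}: ceiling at median {capped:.2f}, no ceiling {uncapped:.2f}")
```

Under these assumptions, the capped covariate no longer absorbs all of the variation in `x1`, so `x2` picks up the remainder as residual confounding. The resulting bias in the `x2` coefficient does not shrink as the sample grows, while its standard error does, which is why the rejection rate climbs with `n` when a ceiling is present and stays near the nominal level when it is not.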
