The use of fractional polynomials to model continuous risk variables in epidemiology.

BACKGROUND The traditional method of analysing continuous or ordinal risk factors by categorization or linear models may be improved.

METHODS We propose an approach based on transformation and fractional polynomials which yields simple regression models with interpretable curves. We suggest a way of presenting the results from such models which involves tabulating the risks estimated from the model at convenient values of the risk factor. We discuss how to incorporate several continuous risk and confounding variables within a single model. The approach is exemplified with data from the Whitehall I study of British civil servants. We discuss the approach in relation to categorization and non-parametric regression models.

RESULTS We show that non-linear risk models fit the data better than linear models. We discuss the difficulties introduced by categorization and the advantages of the new approach.

CONCLUSIONS Our approach based on fractional polynomials should be considered as an important alternative to the traditional approaches for the analysis of continuous variables in epidemiological studies.
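As an illustration of the kind of model described above, the following is a minimal sketch of a first-degree fractional polynomial (FP1) fit for a binary outcome. It is not the authors' implementation. It assumes the conventional power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, with power 0 read as log(x), which the abstract does not state. The helper names (`fp_transform`, `fit_logistic`, `best_fp1`, `tabulate_risk`), the plain Newton-Raphson fitter, the centring and scaling of the transformed variable, and the simulated data are all hypothetical choices made for this example. The sketch fits one logistic model per power, keeps the power with the smallest deviance, and tabulates estimated risks and odds ratios at convenient values of the risk factor.

```python
import numpy as np

# Conventional FP power set; an assumption, since the abstract does not list it.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, with p == 0 read as log(x). Requires x > 0."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("fractional polynomials need a positive risk factor")
    return np.log(x) if p == 0 else x ** p


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Logistic regression by Newton-Raphson; returns (coefficients, deviance)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    deviance = -2.0 * np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))
    return beta, deviance


def best_fp1(x, y):
    """Fit one FP1 model per power; return (best fit, all fits)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fits = []
    for p in FP_POWERS:
        t = fp_transform(x, p)
        # Centre and scale the transformed variable for numerical stability.
        centre, scale = t.mean(), t.std()
        X = np.column_stack([np.ones_like(t), (t - centre) / scale])
        beta, dev = fit_logistic(X, y)
        fits.append({"power": p, "beta": beta, "deviance": dev,
                     "centre": centre, "scale": scale})
    return min(fits, key=lambda f: f["deviance"]), fits


def tabulate_risk(fit, values, reference):
    """Estimated risk at each value, and odds ratio against a reference value."""
    def logit(v):
        z = (fp_transform(v, fit["power"]) - fit["centre"]) / fit["scale"]
        return fit["beta"][0] + fit["beta"][1] * z

    eta = logit(values)
    risk = 1.0 / (1.0 + np.exp(-eta))
    odds_ratio = np.exp(eta - logit(reference))
    return risk, odds_ratio


# Simulated data (hypothetical): the true log-odds are linear in log(x).
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 3.0, size=5000)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * np.log(x)))))

best, all_fits = best_fp1(x, y)
values = np.array([0.5, 1.0, 2.0, 3.0])
risk, odds_ratio = tabulate_risk(best, values, reference=1.0)
print("selected power:", best["power"])
for v, r, o in zip(values, risk, odds_ratio):
    print(f"x = {v:.1f}  risk = {r:.3f}  OR vs x = 1.0: {o:.2f}")
```

Because the linear model (power 1) is one of the candidates, the selected FP1 model can never have a larger deviance than the linear fit. A fuller analysis would also need a formal test of the gain in fit, second-degree models, and adjustment for confounders, none of which this sketch attempts.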
