Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling.

The relationship between a response variable and one or more continuous covariates is often curved. Attempts to represent curvature in single- or multiple-regression models are usually made by means of polynomials of the covariates, typically quadratics. However, low-order polynomials offer a limited family of shapes, and high-order polynomials may fit poorly at the extreme values of the covariates. We propose an extended family of curves, which we call fractional polynomials, whose power terms are restricted to a small predefined set of integer and non-integer values. The powers are selected so that conventional polynomials are a subset of the family. Regression models using fractional polynomials of the covariates have appeared in the literature in an ad hoc fashion over a long period; we provide a unified description and a degree of formalization for them. They are shown to have considerable flexibility and are straightforward to fit using standard methods. We suggest an iterative algorithm for covariate selection and model fitting when several covariates are available. We give six examples of the use of fractional polynomial models in three types of regression analysis: normal errors, logistic and Cox regression. The examples all relate to medical data: fetal measurements, immunoglobulin concentrations in children, diabetes in children, infertility in women, myelomatosis (a type of leukaemia) and leg ulcers.
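As a rough illustration of the idea of choosing a power from a small predefined set, the sketch below fits a one-term fractional polynomial by ordinary least squares. The abstract does not list the candidate powers; the set used here (-2, -1, -0.5, 0, 0.5, 1, 2, 3, with 0 read as log x) is the one usually quoted for this method and is an assumption, as are the helper names `fp_term` and `fit_fp1` and the synthetic data. It covers only the normal-errors case, where the smallest residual sum of squares corresponds to the largest likelihood, and it is not the iterative multi-covariate algorithm mentioned above.

```python
import numpy as np

# Candidate powers (assumed, not given in the abstract); 0 stands for log(x).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """Return x**p, with p == 0 read as log(x). Requires x > 0."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p

def fit_fp1(x, y, powers=POWERS):
    """Fit y = b0 + b1 * x^(p) by least squares for each candidate p.

    Returns (p, coefficients, rss) for the power with the smallest
    residual sum of squares.
    """
    y = np.asarray(y, dtype=float)
    best = None
    for p in powers:
        # Design matrix: intercept plus the transformed covariate.
        X = np.column_stack([np.ones(len(y)), fp_term(x, p)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if best is None or rss < best[2]:
            best = (p, beta, rss)
    return best

# Synthetic, noise-free data generated from a log curve (illustration only).
x = np.linspace(1.0, 10.0, 50)
y = 2.0 + 3.0 * np.log(x)
p, beta, rss = fit_fp1(x, y)
print("selected power:", p, "coefficients:", np.round(beta, 3))
```

Because the search is a loop over a handful of ordinary linear fits, any standard regression routine can be substituted for the least-squares call, which is the sense in which such models are straightforward to fit using standard methods.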
