Regression using Fractional Polynomials of Continuous Covariates: Parsimonious

SUMMARY The relationship between a response variable and one or more continuous covariates is often curved. Attempts to represent curvature in single- or multiple-regression models are usually made by means of polynomials of the covariates, typically quadratics. However, low-order polynomials offer a limited family of shapes, and high-order polynomials may fit poorly at the extreme values of the covariates. We propose an extended family of curves, which we call fractional polynomials, whose power terms are restricted to a small predefined set of integer and non-integer values. The powers are selected so that conventional polynomials are a subset of the family. Regression models using fractional polynomials of the covariates have appeared in the literature in an ad hoc fashion over a long period; we provide a unified description and a degree of formalization for them. They are shown to have considerable flexibility and are straightforward to fit using standard methods. We suggest an iterative algorithm for covariate selection and model fitting when several covariates are available. We give six examples of the use of fractional polynomial models in three types of regression analysis: normal errors, logistic and Cox regression. The examples all relate to medical data: fetal measurements, immunoglobulin concentrations in children, diabetes in children, infertility in women, myelomatosis (a type of leukaemia) and leg ulcers.
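To make the idea of "power terms restricted to a small predefined set" concrete, the following is a minimal sketch of fitting a one-term fractional polynomial by ordinary least squares. It is an illustration under stated assumptions, not the procedure of the paper: the candidate set `POWERS`, the convention that power 0 means log(x), the helper names (`fp_transform`, `fit_line`, `best_fp1`) and the synthetic data are all hypothetical choices made here, and selection is by residual sum of squares only.

```python
import math

# Assumed candidate powers; the summary says only "a small predefined set".
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, reading p == 0 as log(x) (a convention assumed here)."""
    if x <= 0:
        raise ValueError("this sketch requires a positive covariate")
    return math.log(x) if p == 0 else x ** p


def fit_line(u, y):
    """Ordinary least squares for y = b0 + b1*u; returns (b0, b1, rss)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_u
    rss = sum((yi - b0 - b1 * ui) ** 2 for ui, yi in zip(u, y))
    return b0, b1, rss


def best_fp1(x, y, powers=POWERS):
    """Try every candidate power and keep the one with the smallest RSS."""
    best = None
    for p in powers:
        u = [fp_transform(xi, p) for xi in x]
        b0, b1, rss = fit_line(u, y)
        if best is None or rss < best[3]:
            best = (p, b0, b1, rss)
    return best


# Synthetic data generated exactly from y = 2 + 3*sqrt(x), i.e. power 0.5.
x = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0, 8.0]
y = [2.0 + 3.0 * math.sqrt(xi) for xi in x]

p, b0, b1, rss = best_fp1(x, y)
print(p, round(b0, 6), round(b1, 6))
```

Because the candidate set includes the powers 1, 2 and 3, the straight line and the pure quadratic and cubic terms are among the curves the search can select, which mirrors the statement that conventional polynomials are a subset of the family. Each candidate is an ordinary linear model in the transformed covariate, so standard regression software suffices to fit it.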