Statistical issues in the prospective monitoring of health outcomes across multiple units

Following several recent inquiries in the UK into medical malpractice and failures to deliver appropriate standards of health care, there is pressure to introduce formal, routine monitoring of performance outcomes throughout the National Health Service. Statistical process control (SPC) charts have been widely used to monitor medical outcomes in a variety of contexts and have been specifically advocated for use in clinical governance. However, previous applications of SPC charts in medical monitoring have focused on surveillance of a single process over time. We consider some of the methodological and practical aspects that surround the routine surveillance of health outcomes. In particular, we focus on two important methodological issues that arise when attempting to extend SPC charts to monitor outcomes at more than one unit simultaneously (where a unit could be, for example, a surgeon, general practitioner or hospital). The first is the need to acknowledge the inevitable between-unit variation in 'acceptable' performance outcomes due to the net effect of many small unmeasured sources of variation (e.g. unmeasured case mix and data errors). The second is the problem of multiple testing over units as well as time. We address the former by using quasi-likelihood estimates of overdispersion, and the latter by using recently developed methods based on estimation of false discovery rates. We present an application of our approach to annual monitoring of 'all-cause' mortality data between 1995 and 2000 from 169 National Health Service hospital trusts in England and Wales. Copyright 2004 Royal Statistical Society.
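
The two adjustments named above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the abstract: each unit is summarised by a Poisson z-score, (observed - expected) / sqrt(expected); the overdispersion factor is taken as the mean squared z-score (one common Pearson-type quasi-likelihood form); and the Benjamini-Hochberg step-up rule stands in for the false discovery rate estimation methods the abstract refers to. All function names and the example counts are hypothetical.

```python
import math


def poisson_z(observed, expected):
    """z-score for an observed count against its expected count (Poisson assumption)."""
    return (observed - expected) / math.sqrt(expected)


def overdispersion_factor(z_scores):
    """Pearson-type estimate of overdispersion: the mean of the squared z-scores.

    Assumption: no degrees-of-freedom correction and no trimming of extreme units.
    """
    return sum(z * z for z in z_scores) / len(z_scores)


def adjusted_z(z_scores, phi):
    """Deflate z-scores by sqrt(phi); never tighten limits when phi < 1."""
    scale = math.sqrt(max(phi, 1.0))
    return [z / scale for z in z_scores]


def upper_p(z):
    """One-sided p-value for 'worse than expected' under a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(p_values, q=0.1):
    """Return indices of units flagged by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical (observed, expected) death counts for six units.
    counts = [(105, 100.0), (92, 100.0), (130, 100.0),
              (88, 90.0), (160, 110.0), (75, 80.0)]
    z = [poisson_z(o, e) for o, e in counts]
    phi = overdispersion_factor(z)
    p = [upper_p(v) for v in adjusted_z(z, phi)]
    print("overdispersion factor:", round(phi, 2))
    print("flagged units:", benjamini_hochberg(p, q=0.1))
```

Dividing by sqrt(phi) widens the effective control limits so that ordinary between-unit variation is not signalled, while the step-up rule limits the expected proportion of false alarms among the units that are flagged. A prospective system would repeat this over time as well as over units, which the sketch does not attempt.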
