The signal in the noise: Robust detection of performance “outliers” in health services

Abstract: To make the increasing amounts of data about the performance of public sector organisations digestible by decision makers, composite indicators are commonly constructed, from which rankings and league tables are a natural next step. But how much credence should be given to the results of such approaches? Studying English NHS maternity services (N = 130 hospital trusts), we assembled a set of 38 indicators grouped into four baskets, each covering an aspect of service delivery. In the absence of opinion on how the indicators should be aggregated, we focus on the uncertainty that the choice of aggregation weights brings to the composite results. We use a large two-stage Monte Carlo simulation to generate possible aggregation weights and examine how well the resulting composite scores discriminate between trusts. We find that positive and negative “outliers” can be identified robustly, which is of particular value to decision makers seeking targets for investigation, whether for learning or for intervention; results in between, however, should be treated with great caution.
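To make the idea concrete, the following is a minimal sketch of a two-stage weight simulation, not the authors' implementation. The abstract does not say how the two stages are organised, so the sketch assumes that stage 1 draws weights across baskets and stage 2 draws weights for the indicators within each basket, both uniformly over the simplex. The function names, the 95% rank interval, and the demo data (randomly generated, with arbitrary basket sizes of 10, 10, 9 and 9) are all hypothetical.

```python
import random
from statistics import mean


def simplex_weights(k, rng):
    """Draw k positive weights summing to 1, uniform over the simplex."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def standardise(column):
    """Convert one indicator to z-scores so indicators share a scale."""
    mu = mean(column)
    sd = (sum((x - mu) ** 2 for x in column) / len(column)) ** 0.5
    return [(x - mu) / sd for x in column]


def simulate_ranks(scores, baskets, n_draws, rng):
    """Rank trusts under many randomly drawn weighting schemes.

    scores[t][i] is standardised indicator i for trust t (higher = better).
    baskets is a list of lists of indicator indices.
    Returns ranks[t]: the rank of trust t (1 = best) in each draw.
    """
    n_trusts = len(scores)
    ranks = [[] for _ in range(n_trusts)]
    for _ in range(n_draws):
        # Assumed stage 1: weights across the baskets.
        basket_w = simplex_weights(len(baskets), rng)
        weights = {}
        for bw, members in zip(basket_w, baskets):
            # Assumed stage 2: weights for the indicators inside a basket.
            within = simplex_weights(len(members), rng)
            for i, w in zip(members, within):
                weights[i] = bw * w
        composite = [sum(w * row[i] for i, w in weights.items())
                     for row in scores]
        order = sorted(range(n_trusts), key=lambda t: composite[t],
                       reverse=True)
        for position, t in enumerate(order, start=1):
            ranks[t].append(position)
    return ranks


def rank_interval(rank_list, lower=0.025, upper=0.975):
    """Approximate central interval of a trust's simulated ranks."""
    ordered = sorted(rank_list)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


def demo(seed=0):
    """Run the sketch on synthetic data (NOT the study's data)."""
    rng = random.Random(seed)
    n_trusts, sizes = 130, [10, 10, 9, 9]  # basket sizes are assumed
    baskets, start = [], 0
    for size in sizes:
        baskets.append(list(range(start, start + size)))
        start += size
    raw = [[rng.gauss(0, 1) for _ in range(start)] for _ in range(n_trusts)]
    columns = [standardise([row[i] for row in raw]) for i in range(start)]
    scores = [[columns[i][t] for i in range(start)] for t in range(n_trusts)]
    ranks = simulate_ranks(scores, baskets, n_draws=200, rng=rng)
    return [rank_interval(r) for r in ranks]


if __name__ == "__main__":
    for trust, (best, worst) in enumerate(demo()[:5]):
        print(f"trust {trust}: rank interval {best} to {worst}")
```

Under this reading, a trust counts as a robust “outlier” when its whole rank interval sits near one end of the table, whereas a wide interval means its position depends mainly on the weights chosen. The threshold for "near one end" is a judgement call that the sketch leaves open.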
