Loss Function Based Ranking in Two-Stage, Hierarchical Models

Performance evaluation of health services providers is burgeoning. Similarly, analysis of spatially related health information, ranking of teachers and schools, and identification of differentially expressed genes are increasing in prevalence and importance. Goals include valid and efficient ranking of units for profiling and league tables, identification of excellent and poor performers or of the most differentially expressed genes, and determination of "exceedances" (how many and which unit-specific true parameters exceed a threshold). These data and inferential goals require a hierarchical, Bayesian model that accounts for nesting relations and identifies both population values and random effects for unit-specific parameters. Furthermore, the Bayesian approach coupled with optimizing a loss function provides a framework for computing non-standard inferences such as ranks and histograms. Estimated ranks that minimize Squared Error Loss (SEL) between the true and estimated ranks have been investigated. The posterior mean ranks minimize SEL and are "general purpose," relevant to a broad spectrum of ranking goals. However, other loss functions, and the optimal ranks tuned to application-specific goals, require identification and evaluation. For example, when the goal is to identify the relatively good (e.g., in the upper 10%) or relatively poor performers, a loss function that penalizes classification errors produces estimates that minimize the error rate. We construct loss functions that address this and other goals, developing a unified framework that facilitates generating candidate estimates, comparing approaches, and producing data-analytic performance summaries. We compare performance for a fully parametric, hierarchical model with a Gaussian sampling distribution under both a Gaussian prior and a mixture-of-Gaussians prior.
We illustrate the approaches via analysis of standardized mortality ratio data from the United States Renal Data System. Results show that SEL-optimal ranks perform well over a broad class of loss functions but can be improved upon when classifying units above or below a percentile cut-point. Importantly, even optimal rank estimates can perform poorly in many real-world settings; therefore, data-analytic performance summaries should always be reported.
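
As a rough illustration of the two kinds of estimates contrasted above, the following Python sketch computes posterior mean ranks (the SEL-optimal ranks) and, for a percentile cut-point, the posterior probability that each unit lies above the cut. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names, the percentile definition rank/(K + 1), the simulated data, and the treatment of the prior variance as known are all assumptions made for illustration.

```python
import numpy as np


def ranks_per_draw(theta_draws):
    """Rank units within each posterior draw (1 = smallest parameter value).

    theta_draws has shape (n_draws, n_units).
    """
    order = np.argsort(theta_draws, axis=1)
    return np.argsort(order, axis=1) + 1


def sel_optimal_ranks(theta_draws):
    """Posterior mean ranks, plus integer ranks obtained by ranking them."""
    mean_ranks = ranks_per_draw(theta_draws).mean(axis=0)
    integer_ranks = np.argsort(np.argsort(mean_ranks)) + 1
    return mean_ranks, integer_ranks


def prob_above_cut(theta_draws, gamma):
    """Posterior probability that each unit's percentile exceeds gamma.

    Assumption: a unit's percentile is its rank divided by (n_units + 1).
    """
    n_units = theta_draws.shape[1]
    percentiles = ranks_per_draw(theta_draws) / (n_units + 1)
    return (percentiles > gamma).mean(axis=0)


# Hypothetical Gaussian-Gaussian example with hyperparameters treated as known:
# theta_k ~ N(0, tau2) and Y_k | theta_k ~ N(theta_k, sigma2_k).
rng = np.random.default_rng(0)
n_units, n_draws, tau2 = 20, 4000, 1.0
sigma2 = rng.uniform(0.2, 2.0, size=n_units)   # unit-specific sampling variances
theta_true = rng.normal(0.0, np.sqrt(tau2), size=n_units)
y = rng.normal(theta_true, np.sqrt(sigma2))

# Conjugate posterior: theta_k | Y_k ~ N(b_k * Y_k, b_k * sigma2_k),
# where b_k = tau2 / (tau2 + sigma2_k) is the shrinkage weight.
b = tau2 / (tau2 + sigma2)
theta_draws = rng.normal(b * y, np.sqrt(b * sigma2), size=(n_draws, n_units))

mean_ranks, integer_ranks = sel_optimal_ranks(theta_draws)
p_top = prob_above_cut(theta_draws, gamma=0.9)

# Flag as many units as the cut implies, choosing those with the largest
# posterior probability of lying above the cut.
n_flag = int(np.sum(np.arange(1, n_units + 1) / (n_units + 1) > 0.9))
flagged = np.argsort(-p_top)[:n_flag]
print("posterior mean ranks:", np.round(mean_ranks, 1))
print("units flagged as upper 10%:", flagged)
```

The posterior mean ranks are generally non-integer and shrink toward the middle rank when posterior uncertainty is large. Reporting the posterior probabilities alongside the flagged units is one simple data-analytic summary of how reliable the classification is.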
