Predicting patient ‘cost blooms’ in Denmark: a longitudinal population-based study

Objectives To compare the ability of standard versus enhanced models to predict future high-cost patients, especially those who move from a lower decile to the upper decile of per capita healthcare expenditures within 1 year, that is, ‘cost bloomers’.
Design We developed alternative models to predict being in the upper decile of healthcare expenditures in year 2 of a sample, based on data from year 1. Our 6 alternative models ranged from a standard cost-prediction model with 4 variables (ie, traditional model features), to our largest enhanced model with 1053 non-traditional model features. To quantify any increases in predictive power that enhanced models achieved over standard tools, we compared the prospective predictive performance of each model.
Participants and Setting We used the population of Western Denmark between 2004 and 2011 (2 146 801 individuals) to predict future high-cost patients and characterise high-cost patient subgroups. Using the most recent 2-year period (2010–2011) for model evaluation, our whole-population model used a cohort of 1 557 950 individuals with a full year of active residency in year 1 (2010). Our cost-bloom model excluded the 155 795 individuals who were already high cost at the population level in year 1, resulting in 1 402 155 individuals for prediction of cost bloomers in year 2 (2011).
Primary outcome measures Using unseen data from a future year, we evaluated each model's prospective predictive performance by calculating the ratio of predicted high-cost patient expenditures to the actual high-cost patient expenditures in year 2, that is, cost capture.
Results Our best enhanced model achieved a 21% and 30% improvement in cost capture over a standard diagnosis-based model for predicting population-level high-cost patients and cost bloomers, respectively.
Conclusions In combination with modern statistical learning methods for analysing large data sets, models enhanced with a large and diverse set of features led to better performance, especially for predicting future cost bloomers.
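
The cost capture measure and the cost-bloom cohort described above can be illustrated with a minimal Python sketch. It assumes one reading of the definition: the year 2 expenditures of the individuals a model ranks in its top decile, divided by the year 2 expenditures of the individuals who actually fall in the top decile. The function names, the risk-score input, and the default fraction argument are hypothetical and are not taken from the study's own code.

```python
from typing import List, Sequence


def top_indices(values: Sequence[float], fraction: float) -> List[int]:
    """Return the indices of the highest `fraction` of values (at least one)."""
    k = max(1, int(len(values) * fraction))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:k]


def cost_capture(risk_scores: Sequence[float],
                 year2_costs: Sequence[float],
                 fraction: float = 0.10) -> float:
    """Share of actual top-group spending captured by the predicted top group.

    Assumed definition: sum of year 2 costs of the predicted high-cost
    individuals divided by the sum of year 2 costs of the actual
    high-cost individuals.
    """
    if len(risk_scores) != len(year2_costs):
        raise ValueError("risk_scores and year2_costs must have equal length")
    predicted = top_indices(risk_scores, fraction)
    actual = top_indices(year2_costs, fraction)
    captured = sum(year2_costs[i] for i in predicted)
    attainable = sum(year2_costs[i] for i in actual)
    if attainable == 0:
        raise ValueError("actual high-cost expenditures sum to zero")
    return captured / attainable


def cost_bloom_cohort(year1_costs: Sequence[float],
                      fraction: float = 0.10) -> List[int]:
    """Indices of individuals who were NOT already high cost in year 1."""
    already_high = set(top_indices(year1_costs, fraction))
    return [i for i in range(len(year1_costs)) if i not in already_high]
```

As a usage sketch, `cost_capture(scores, year2_costs)` would be computed once on the whole population and once on the individuals returned by `cost_bloom_cohort(year1_costs)`, so that the second figure reflects only patients who were not already in the top decile. The cohort sizes in the abstract are consistent with this exclusion: 1 557 950 minus 155 795 gives 1 402 155.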
