Application of multivariate probabilistic (Bayesian) networks to substance use disorder risk stratification and cost estimation.

INTRODUCTION This paper explores the use of machine learning and Bayesian classification models to develop broadly applicable risk stratification models to guide disease management of health plan enrollees with substance use disorder (SUD). Payers understand the high costs and morbidities associated with SUD and manage it through utilization review, acute interventions, coverage and cost limitations, and disease management, but the literature shows mixed results for these modalities in improving patient outcomes and controlling cost. Our objective is to evaluate the potential of data mining methods to identify novel risk factors for chronic disease and to stratify enrollee utilization, so that disease management services can be targeted to maximize benefits to both enrollees and payers.
METHODS For our evaluation, we used DecisionQ machine learning algorithms to build Bayesian network models of a representative sample of data licensed from Thomson-Reuters' MarketScan consisting of 185,322 enrollees with three full-year claim records. Data sets were prepared, and a stepwise learning process was used to train a series of Bayesian belief networks (BBNs). The BBNs were validated using a 10 percent holdout set.
RESULTS The networks were highly predictive, with the risk-stratification BBNs producing area under the curve (AUC) for SUD positive of 0.948 (95 percent confidence interval [CI], 0.944-0.951) and 0.736 (95 percent CI, 0.721-0.752), respectively, and SUD negative of 0.951 (95 percent CI, 0.947-0.954) and 0.738 (95 percent CI, 0.727-0.750), respectively. The cost estimation models produced AUCs ranging from 0.72 (95 percent CI, 0.708-0.731) to 0.961 (95 percent CI, 0.95-0.971).
CONCLUSION We were able to successfully model a large, heterogeneous population of commercial enrollees, applying state-of-the-art machine learning technology to develop complex and accurate multivariate models that support near-real-time scoring of novel payer populations based on historic claims and diagnostic data. Initial validation results indicate that we can stratify enrollees with SUD diagnoses into different cost categories with a high degree of sensitivity and specificity, and the most challenging issue becomes one of policy. Due to the social stigma associated with the disease and ethical issues pertaining to access to care and individual versus societal benefit, a thoughtful dialogue needs to occur about the appropriate way to implement these technologies.
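As a rough illustration of the validation step described under METHODS and RESULTS (a 10 percent holdout set scored by AUC with a 95 percent CI), the following is a minimal sketch under stated assumptions, not the authors' implementation. A naive Bayes classifier stands in for the DecisionQ BBNs, the records are synthetic binary features rather than MarketScan claims, and the Hanley-McNeil approximation is assumed for the interval because the abstract does not say how the CIs were computed. All function names are hypothetical.

```python
import math
import random

def make_synthetic(n, n_features=5, seed=1):
    """Synthetic (features, label) records; NOT claims data, illustration only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        p_on = 0.7 if y else 0.3  # features fire more often for positives
        x = tuple(1 if rng.random() < p_on else 0 for _ in range(n_features))
        data.append((x, y))
    return data

def holdout_split(records, holdout_frac=0.10, seed=0):
    """Shuffle and reserve a fraction of records for validation."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_hold = int(round(len(shuffled) * holdout_frac))
    return shuffled[n_hold:], shuffled[:n_hold]

def train_naive_bayes(train, n_features, alpha=1.0):
    """Naive Bayes (a very simple Bayesian network) with Laplace smoothing."""
    counts = {y: [[alpha, alpha] for _ in range(n_features)] for y in (0, 1)}
    class_n = {0: alpha, 1: alpha}
    for x, y in train:
        class_n[y] += 1
        for j, v in enumerate(x):
            counts[y][j][v] += 1
    return counts, class_n

def predict_proba(model, x):
    """Posterior probability of the positive class for one record."""
    counts, class_n = model
    log_p = {}
    for y in (0, 1):
        lp = math.log(class_n[y])  # unnormalized prior; the constant cancels
        for j, v in enumerate(x):
            lp += math.log(counts[y][j][v] / (counts[y][j][0] + counts[y][j][1]))
        log_p[y] = lp
    m = max(log_p.values())
    e0, e1 = math.exp(log_p[0] - m), math.exp(log_p[1] - m)
    return e1 / (e0 + e1)

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as P(random positive outscores random negative); ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95 percent CI for AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

records = make_synthetic(2000)
train, holdout = holdout_split(records, holdout_frac=0.10)
model = train_naive_bayes(train, n_features=5)
pos = [predict_proba(model, x) for x, y in holdout if y == 1]
neg = [predict_proba(model, x) for x, y in holdout if y == 0]
auc = auc_mann_whitney(pos, neg)
ci_low, ci_high = hanley_mcneil_ci(auc, len(pos), len(neg))
print(f"holdout AUC = {auc:.3f} (95 percent CI, {ci_low:.3f}-{ci_high:.3f})")
```

The interval narrows as the number of positive and negative holdout records grows, which is in line with the tight intervals reported for a holdout drawn from 185,322 enrollees.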
