Development and Validation of a Predictive Model to Identify Individuals Likely to Have Undiagnosed Chronic Obstructive Pulmonary Disease Using an Administrative Claims Database.

BACKGROUND Despite the importance of early detection, delayed diagnosis of chronic obstructive pulmonary disease (COPD) is relatively common. Approximately 12 million people in the United States have undiagnosed COPD. Diagnosis of COPD is essential for the timely implementation of interventions, such as smoking cessation programs, drug therapies, and pulmonary rehabilitation, which are aimed at improving outcomes and slowing disease progression.

OBJECTIVE To develop and validate a predictive model to identify patients likely to have undiagnosed COPD using administrative claims data.

METHODS A predictive model was developed and validated using a retrospective cohort of patients with and without a COPD diagnosis (cases and controls), aged 40-89 years, with a minimum of 24 months of continuous health plan enrollment (Medicare Advantage Prescription Drug [MAPD] and commercial plans). Patients were identified between January 1, 2009, and December 31, 2012, using Humana's claims database. Stratified random sampling based on plan type (commercial or MAPD) and index year was performed to ensure that cases and controls had a similar distribution of these variables. Cases and controls were compared to identify demographic, clinical, and health care resource utilization (HCRU) characteristics associated with a COPD diagnosis. Stepwise logistic regression (SLR), neural networks, and decision trees were used to develop a series of models. The models were trained, validated, and tested on randomly partitioned subsets of the sample (Training, Validation, and Test data subsets). Measures used to evaluate and compare the models included the area under the curve (AUC) index of the receiver operating characteristic (ROC) curve, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The optimal model was selected based on the AUC index on the Test data subset.
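The partitioning and AUC-based comparison described in the methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 60/20/20 split fractions, the function names `partition` and `auc`, and the toy scores are all assumptions, since the abstract does not report the partition sizes or the software calls used.

```python
import random

def partition(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split records into Training, Validation, and Test subsets.

    The 60/20/20 fractions are an illustrative assumption; the abstract
    does not state the proportions actually used.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_valid = int(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

def auc(labels, scores):
    """AUC as the probability that a randomly chosen case (label 1)
    receives a higher score than a randomly chosen control (label 0).
    Tied scores count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

# Toy example (hypothetical scores): two cases, two controls.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(auc(toy_labels, toy_scores))
```

Under this sketch, each candidate model would be scored on the Test subset and the one with the highest AUC retained, mirroring the selection rule stated above.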
RESULTS A total of 50,880 cases and 50,880 controls were included, with MAPD patients comprising 92% of the study population. Compared with controls, cases had a statistically significantly higher comorbidity burden and HCRU (including hospitalizations, emergency room visits, and medical procedures). The optimal predictive model was generated using SLR, which included 34 variables that were statistically significantly associated with a COPD diagnosis. After adjusting for covariates, anticholinergic bronchodilators (OR = 3.336) and tobacco cessation counseling (OR = 2.871) were found to have a large influence on the model. The final predictive model had an AUC of 0.754, sensitivity of 60%, specificity of 78%, PPV of 73%, and an NPV of 66%.

CONCLUSIONS This claims-based predictive model provides an acceptable level of accuracy in identifying patients likely to have undiagnosed COPD in a large national health plan. Identification of patients with undiagnosed COPD may enable timely management and lead to improved health outcomes and reduced COPD-related health care expenditures.
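As a consistency check on the reported metrics, the PPV and NPV can be recomputed from the sensitivity and specificity. The sketch below assumes that the evaluation data had the same 1:1 case-to-control ratio as the full sample (a prevalence of 0.5); the abstract does not state the composition of the Test subset, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a classifier with the given sensitivity and
    specificity, applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Reported sensitivity (60%) and specificity (78%); the prevalence of 0.5
# is an assumption based on the equal numbers of cases and controls.
ppv, npv = predictive_values(0.60, 0.78, 0.5)
print(round(ppv, 2), round(npv, 2))  # prints: 0.73 0.66
```

The recomputed values match the reported PPV of 73% and NPV of 66%. Because PPV depends on prevalence, applying the same sensitivity and specificity to a population where undiagnosed COPD is less common than 50% would yield a lower PPV.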
