Covid-19 risk factors: statistical learning from German healthcare claims data

Abstract

Background: Precise individual risk quantification of severe courses of Covid-19 is needed to prioritize protective measures and to assess population risks in a phase of increased immunization. So far, results for the German population are lacking. Furthermore, existing studies pre-specify comorbidity risks by broad categories rather than deriving them from the data using statistical learning techniques.

Methods: Risk factors for severe, critical and lethal courses of Covid-19 are identified from a large German claims dataset covering more than 4 million individuals. To avoid prior grouping and pre-selection of risk factors, fine-grained hierarchical information from medical classification systems for diagnoses, pharmaceuticals and procedures is used, resulting in more than 33,000 covariates. These are processed using a LASSO approach.

Results: We identify relevant risk factors, among which hypertensive diseases, heart disease and the corresponding medications are most relevant at the population level. Prior use of diuretics is the strongest single medical predictor for a severe course (e.g. Torasemide, odds ratio (OR) 1.801), and also for a critical course (OR 2.304) and death (OR 2.523). To assess risk profiles at the individual level, our approach sums up many such factors and has better predictive ability than using pre-specified morbidity groups (AUC for predicting a critical course 0.875 versus AUC ≤ 0.865).

Conclusions: The proposed method can help to identify risk factors and assess risk at the individual level for other infectious diseases. The results can be used by administrative data holders to guide protective policies, while a risk index can be applied in clinical studies with a narrower focus.
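The idea of summing many factors into an individual risk profile can be illustrated with a minimal sketch. It assumes a logistic model, in which each factor that is present adds the logarithm of its odds ratio to the log-odds of the outcome. The function name `risk_from_odds_ratios` and the baseline risk of 5% are hypothetical and chosen only for illustration; the three Torasemide odds ratios are the ones reported above. This is not the authors' implementation, and it ignores that the reported odds ratios are estimated jointly with many other covariates.

```python
import math


def risk_from_odds_ratios(baseline_risk, odds_ratios):
    """Combine a baseline risk with the odds ratios of all factors present.

    In a logistic model the log-odds are additive, so each factor
    contributes log(OR) to the linear predictor.
    """
    log_odds = math.log(baseline_risk / (1.0 - baseline_risk))
    log_odds += sum(math.log(odds_ratio) for odds_ratio in odds_ratios)
    # Map the summed log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Odds ratios for prior Torasemide use, as reported in the abstract.
TORASEMIDE_OR = {"severe": 1.801, "critical": 2.304, "death": 2.523}

# Hypothetical baseline risk, NOT taken from the paper. Using the same
# value for all three outcomes is a simplification for illustration.
BASELINE_RISK = 0.05

risks = {
    outcome: risk_from_odds_ratios(BASELINE_RISK, [odds_ratio])
    for outcome, odds_ratio in TORASEMIDE_OR.items()
}

for outcome, risk in risks.items():
    print(f"{outcome}: {risk:.3f}")
```

Passing several odds ratios in the list corresponds to a person with several risk factors; an empty list returns the baseline risk unchanged.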
