Risk Factors and Predictive Modeling for Post-Acute Sequelae of SARS-CoV-2 Infection: Findings from EHR Cohorts of the RECOVER Initiative

Background Patients infected with SARS-CoV-2 can develop newly incident conditions in the post-acute infection period. These conditions, denoted as the post-acute sequelae of SARS-CoV-2 infection (PASC), are highly heterogeneous and involve a diverse set of organ systems. Few studies have investigated the predictability of these conditions and their associated risk factors.
Method In this retrospective cohort study, we investigated two large-scale PCORnet clinical research networks, INSIGHT and OneFlorida+, including 11 million patients in the New York City area and 16.8 million patients from Florida, to develop machine learning prediction models for patients at risk of newly incident PASC and to identify factors associated with newly incident PASC conditions. Adult patients aged 20 years or older, with SARS-CoV-2 infection and without recorded infection, between March 1st, 2020, and November 30th, 2021, were used to identify factors associated with incident PASC after removing background associations. The predictive models were developed on infected adults.
Results We find that several incident PASC conditions, e.g., malnutrition, COPD, dementia, and acute kidney failure, were associated with severe acute SARS-CoV-2 infection, defined by hospitalization and ICU stay. Older age and extremes of weight were also associated with these incident conditions. These conditions were the best predicted (C-index >0.8). Moderately predictable conditions included diabetes and thromboembolic disease (C-index 0.7–0.8); these were associated with a wider variety of baseline conditions. Less predictable conditions included fatigue, anxiety, sleep disorders, and depression (C-index around 0.6).
Conclusions This observational study suggests that a set of likely risk factors for different PASC conditions were identifiable from EHRs, predictability of different PASC conditions was heterogeneous, and using machine learning-based predictive models might help in identifying patients who were at risk of developing incident PASC.
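
The C-index values reported above summarize how well a model ranks patients by risk when follow-up may be censored. The abstract does not state how the C-index was computed, so the following is only a minimal sketch of one common definition (Harrell's concordance index), not the study's implementation. The function name `concordance_index` and the toy data are illustrative assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time per patient
    events      -- 1/True if the condition was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified version
        if not events[a]:
            continue  # earlier patient was censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.6, 0.7, 0.2]
c_index = concordance_index(times, events, risk_scores)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why conditions with a C-index around 0.6 are described as less predictable than those above 0.8.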
