Understanding COVID-19 trajectories from a nationwide linked electronic health record cohort of 56 million people: phenotypes, severity, waves & vaccination

Background: An updatable understanding of the onset and progression of individuals' COVID-19 trajectories underpins pandemic mitigation efforts. To identify and characterise individual trajectories, we defined and validated ten COVID-19 phenotypes from linked electronic health records (EHR) on a nationwide scale using an extensible framework.

Methods: Cohort study of 56.6 million people in England alive on 23/01/2020, followed until 31/05/2021, using eight linked national datasets spanning COVID-19 testing, vaccination, primary and secondary care, and death registration data. We defined ten COVID-19 phenotypes reflecting clinically relevant stages of disease severity using a combination of international clinical terminologies (e.g. SNOMED-CT, ICD-10) and bespoke data fields: positive test, primary care diagnosis, hospitalisation, critical care (four phenotypes), and death (three phenotypes). Using these phenotypes, we constructed patient trajectories illustrating the transition frequency and duration between phenotypes. Analyses were stratified by pandemic wave and vaccination status.

Findings: We identified 3,469,528 infected individuals (6.1%) with 8,825,738 recorded COVID-19 phenotypes. Of these, 364,260 (11%) were hospitalised and 140,908 (4%) died. Of those hospitalised, 38,072 (10%) were admitted to intensive care (ICU), 54,026 (15%) received non-invasive ventilation and 21,404 (6%) received invasive ventilation. Amongst hospitalised patients in non-ICU settings, first-wave mortality (30%) was higher than second-wave mortality (23%), but mortality remained unchanged for ICU patients. The highest mortality was for patients receiving critical care outside of ICU in wave 1 (51%). A total of 13,083 (9%) COVID-19-related deaths occurred without a diagnosis on the death certificate but within 30 days of a positive test, while 10,403 (7%) cases were identified from mortality data alone, with no prior phenotypes recorded. We observed longer patient trajectories in the second pandemic wave than in the first.

Interpretation: Our analyses illustrate the wide spectrum of severity that COVID-19 displays and significant differences in incidence, survival and pathways across pandemic waves. We provide an adaptable framework to answer questions of clinical and policy relevance, such as new variant impact and booster dose efficacy, and a way of maximising existing data to understand individuals' progression through disease states.
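As a rough illustration of what "transition frequency and duration between phenotypes" could mean computationally, the sketch below counts consecutive phenotype transitions per patient and takes the median number of days for each transition. It is a minimal sketch under stated assumptions, not the study's pipeline: the function name `build_transitions`, the phenotype labels, and the sample events are all hypothetical, and the real analysis operates on linked national datasets with validated codelists.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import median


def build_transitions(events):
    """Summarise phenotype transitions.

    events: iterable of (patient_id, phenotype, event_date) tuples.
    Returns (counts, median_days), both keyed by (from_phenotype, to_phenotype).
    Hypothetical helper for illustration only.
    """
    # Group each patient's events into a timeline.
    by_patient = defaultdict(list)
    for patient_id, phenotype, event_date in events:
        by_patient[patient_id].append((event_date, phenotype))

    counts = Counter()
    durations = defaultdict(list)
    for timeline in by_patient.values():
        # Order chronologically; the sort is stable, so same-day events
        # keep their input order.
        timeline.sort(key=lambda item: item[0])
        # Walk consecutive pairs of events in the timeline.
        for (d0, p0), (d1, p1) in zip(timeline, timeline[1:]):
            if p0 == p1:
                continue  # a repeated phenotype is not a transition
            counts[(p0, p1)] += 1
            durations[(p0, p1)].append((d1 - d0).days)

    median_days = {pair: median(days) for pair, days in durations.items()}
    return counts, median_days


# Made-up example records (not study data).
sample_events = [
    (1, "positive_test", date(2020, 4, 1)),
    (1, "hospitalisation", date(2020, 4, 5)),
    (1, "death", date(2020, 4, 15)),
    (2, "positive_test", date(2020, 11, 10)),
    (2, "hospitalisation", date(2020, 11, 17)),
    (3, "death", date(2021, 1, 3)),  # identified from mortality data alone
]

counts, median_days = build_transitions(sample_events)
for pair, n in sorted(counts.items()):
    print(pair, n, median_days[pair])
```

Stratifying by pandemic wave or vaccination status would, under the same assumptions, amount to filtering `sample_events` before calling the function. A patient with a single recorded phenotype (patient 3 here) contributes no transitions.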
