Reducing diagnostic delays in Acute Hepatic Porphyria using electronic health records data and machine learning: a multicenter development and validation study

Importance: Acute Hepatic Porphyria (AHP) is a group of rare but treatable conditions associated with diagnostic delays of fifteen years on average. The advent of electronic health records (EHR) data and machine learning (ML) may help improve the timely recognition of rare diseases like AHP. However, prediction models can be difficult to train given the limited case numbers, unstructured EHR data, and selection biases intrinsic to healthcare delivery.

Objective: To train and characterize models for identifying patients with AHP.

Design, Setting, and Participants: This diagnostic study used structured and notes-based EHR data from two centers at the University of California, UCSF (2012-2022) and UCLA (2019-2022). The data were split into two cohorts (referral, diagnosis) and used to develop models that predict: 1) who will be referred for testing, among those who presented with abdominal pain (a cardinal symptom of AHP), and 2) who will test positive, among those referred. The referral cohort consisted of 747 patients referred for testing and 99,849 contemporaneous patients who were not. The diagnosis cohort consisted of 72 confirmed AHP cases and 347 patients who tested negative. Cases were female predominant and 6-75 years old at the time of diagnosis. Candidate models used a range of architectures. Feature selection was semi-automated and incorporated publicly available data from knowledge graphs.

Main Outcomes and Measures: F-score on an outcome-stratified test set.

Results: The best center-specific referral models achieved an F-score of 86-91%. The best diagnosis model achieved an F-score of 92%. To further test our models, we contacted 372 current patients who lack an AHP diagnosis but were predicted by our models as potentially having it (≥10% probability of referral, ≥50% probability of testing positive). However, we were only able to recruit 10 of these patients for biochemical testing, all of whom were negative. Nonetheless, post hoc evaluations suggested that these models could identify 71% of cases earlier than their diagnosis date, saving 1.2 years.

Conclusions and Relevance: ML can reduce diagnostic delays in AHP and other rare diseases. Robust recruitment strategies and multicenter coordination will be needed to validate these models before they can be deployed.
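The two-stage screening rule and the evaluation metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Patient` record, the function names, and the example scores are hypothetical. It also assumes that both thresholds must be met at once and that "F-score" refers to the F1 score (the harmonic mean of precision and recall); the abstract confirms neither.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract; requiring both at once is an assumption.
REFERRAL_THRESHOLD = 0.10   # >= 10% predicted probability of referral
DIAGNOSIS_THRESHOLD = 0.50  # >= 50% predicted probability of testing positive


@dataclass
class Patient:
    """Hypothetical record holding the two model outputs for one patient."""
    patient_id: str
    p_referral: float   # assumed output of a referral model, in [0, 1]
    p_positive: float   # assumed output of a diagnosis model, in [0, 1]


def flag_for_outreach(patients):
    """Return the patients whose scores clear both thresholds."""
    return [
        p for p in patients
        if p.p_referral >= REFERRAL_THRESHOLD
        and p.p_positive >= DIAGNOSIS_THRESHOLD
    ]


def f1_score(y_true, y_pred):
    """F1 score for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    cohort = [
        Patient("A", p_referral=0.25, p_positive=0.80),
        Patient("B", p_referral=0.05, p_positive=0.90),
        Patient("C", p_referral=0.40, p_positive=0.30),
    ]
    print([p.patient_id for p in flag_for_outreach(cohort)])
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In this toy cohort only patient A clears both thresholds; B falls below the referral threshold and C below the diagnosis threshold.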
