Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders

Abstract: This study explores the utility of large language models (LLMs), specifically ChatGPT and Google Bard, in predicting neuropathologic diagnoses from clinical summaries. A total of 25 cases of neurodegenerative disorders presented at Mayo Clinic brain bank Clinico‐Pathological Conferences were analyzed. The LLMs provided multiple pathologic diagnoses with their rationales, which were compared with the final clinical diagnoses made by physicians. ChatGPT‐3.5, ChatGPT‐4, and Google Bard gave the correct diagnosis as their primary diagnosis in 32%, 52%, and 40% of cases, respectively, and included the correct diagnosis somewhere among their listed diagnoses in 76%, 84%, and 76% of cases, respectively. These findings highlight the potential of artificial intelligence tools such as ChatGPT in neuropathology and suggest that they may facilitate more comprehensive discussions in clinicopathological conferences.
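The reported percentages can be turned back into approximate case counts. The sketch below is illustrative only: it assumes that all 25 cases form the denominator for every model, which the abstract implies but does not state outright, and the names `TOTAL_CASES`, `reported`, and `implied_count` are hypothetical helpers, not part of the study.

```python
# Assumption: every percentage in the abstract is computed over all 25 cases.
TOTAL_CASES = 25

# Percentages as reported in the abstract:
# (correct as primary diagnosis, correct diagnosis included in the list)
reported = {
    "ChatGPT-3.5": (32, 76),
    "ChatGPT-4": (52, 84),
    "Google Bard": (40, 76),
}


def implied_count(percent, total=TOTAL_CASES):
    """Convert a reported percentage into the number of cases it implies."""
    return round(percent * total / 100)


# Map each model to (primary-correct cases, included-correct cases).
counts = {
    model: (implied_count(primary), implied_count(included))
    for model, (primary, included) in reported.items()
}

for model, (primary_n, included_n) in counts.items():
    print(f"{model}: primary correct in {primary_n}/{TOTAL_CASES}, "
          f"included in {included_n}/{TOTAL_CASES}")
```

Under that assumption, each percentage corresponds to a whole number of cases, which is a quick consistency check on the reported figures.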
