Assessing socioeconomic bias in machine learning algorithms in health care: a case study of the HOUSES index

OBJECTIVE Artificial intelligence (AI) models may propagate harmful biases in performance and thereby harm underserved populations. We aimed to assess the degree to which the data quality of electronic health records (EHRs), as affected by inequities related to low socioeconomic status (SES), results in differential performance of AI models across SES levels.

MATERIALS AND METHODS This study used existing machine learning models for predicting asthma exacerbation in children with asthma. We compared the balanced error rate (BER) across SES levels as measured by the HOUsing-based SocioEconomic Status measure (HOUSES) index. As a possible mechanism for differential performance, we also compared the incompleteness of EHR information relevant to asthma care by SES.

RESULTS Children with asthma and lower SES had a larger BER than those with higher SES (eg, ratio = 1.35 for HOUSES Q1 vs Q2-Q4) and a higher proportion of missing information relevant to asthma care (eg, 41% vs 24% for missing asthma severity and 12% vs 9.8% for undiagnosed asthma despite meeting asthma criteria).

DISCUSSION Our study suggests that lower SES is associated with worse predictive model performance. It also highlights the potential role of incomplete EHR data in this differential performance and suggests a way to mitigate this bias.

CONCLUSION The HOUSES index allows AI researchers to assess bias in predictive model performance by SES. Although our case study was conducted at a single site with a small sample, its results highlight a potential strategy for identifying bias using an innovative SES measure.
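The BER comparison described above can be illustrated with a minimal sketch. It assumes the common definition of BER as the mean of the false-positive and false-negative rates for a binary outcome; the abstract does not state the formula, so this is an assumption. The function names, the group labels, and the toy records are hypothetical and are not the study's data or code.

```python
from collections import defaultdict


def balanced_error_rate(y_true, y_pred):
    """Assumed definition: BER = (FPR + FNR) / 2 for binary labels (1 = exacerbation)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    neg = sum(1 for t in y_true if t == 0)
    pos = sum(1 for t in y_true if t == 1)
    if neg == 0 or pos == 0:
        raise ValueError("BER needs at least one positive and one negative case")
    return 0.5 * (fp / neg + fn / pos)


def ber_by_group(records):
    """Compute BER per SES group from (group, y_true, y_pred) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, t, p in records:
        grouped[group][0].append(t)
        grouped[group][1].append(p)
    return {g: balanced_error_rate(ts, ps) for g, (ts, ps) in grouped.items()}


# Made-up toy predictions, NOT data from the study.
records = [
    ("Q1", 1, 0), ("Q1", 1, 1), ("Q1", 0, 1), ("Q1", 0, 0),
    ("Q2-Q4", 1, 0), ("Q2-Q4", 1, 1), ("Q2-Q4", 1, 1), ("Q2-Q4", 1, 1),
    ("Q2-Q4", 0, 1), ("Q2-Q4", 0, 0), ("Q2-Q4", 0, 0), ("Q2-Q4", 0, 0),
]

ber = ber_by_group(records)
# Ratio of BER in the lowest SES quartile to BER in the remaining quartiles.
ber_ratio = ber["Q1"] / ber["Q2-Q4"]
print(ber, ber_ratio)
```

A ratio above 1 in such a comparison would indicate worse model performance in the lowest SES group, which is the direction the abstract reports (ratio = 1.35 for HOUSES Q1 vs Q2-Q4).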
