Paper Information - Joint Fairness Model with Applications to Risk Predictions for Under-represented Populations

Under-representation of certain populations, based on gender, race/ethnicity, and age, in data collection for predictive modeling may yield less accurate predictions for the under-represented groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly used to perform crucial decision-making tasks. Methods to achieve fairness in the machine learning literature typically build a single prediction model subject to some fairness criteria, in a manner that encourages fair prediction performance for all groups. These approaches have two major limitations: i) fairness is often achieved by compromising accuracy for some groups; ii) the underlying relationship between dependent and independent variables may differ across groups, which a single model cannot capture. We propose a Joint Fairness Model (JFM) approach for binary outcomes that estimates group-specific classifiers using a joint modeling objective function that incorporates fairness criteria for prediction. We introduce an Accelerated Smoothing Proximal Gradient Algorithm to solve the convex objective function, and present the key asymptotic properties of the JFM parameter estimates. Through extensive simulations, we examine the efficacy of the JFM approach in achieving prediction performance and parity, in comparison with the Single Fairness Model, group-separate model, and group-ignorant model. Finally, we demonstrate the utility of the JFM method in the motivating example to obtain fair risk predictions for under-represented older patients diagnosed with coronavirus disease 2019 (COVID-19).

Padhraic Smyth | Preston Putzel | Shinjini Nandi | Hyungrok Do | Judy Zhong
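
The joint-estimation idea summarized in the abstract (group-specific classifiers fitted through one shared objective) can be illustrated with a small sketch. This is not the authors' JFM or their Accelerated Smoothing Proximal Gradient Algorithm: the abstract does not specify the fairness penalty or the smoothing step, so neither is reproduced here. As a stand-in, the sketch assumes two groups, a squared-difference penalty that pulls the two coefficient vectors together, an L1 sparsity penalty, and a generic FISTA-style accelerated proximal gradient loop. The names `fit_joint`, `lam_sim`, `lam_sp`, and the simulated data are all hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_threshold(v, t):
    """Proximal operator of t * |v| (handles the L1 sparsity penalty)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def objective(betas, groups, lam_sim, lam_sp):
    """Per-group mean logistic loss + coupling penalty + L1 penalty (assumed form)."""
    total = 0.0
    for beta, (X, y) in zip(betas, groups):
        for xi, yi in zip(X, y):
            z = dot(beta, xi)
            # log(1 + exp(z)) - y*z, written to avoid overflow
            total += (max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z) / len(X)
    total += 0.5 * lam_sim * sum((a - b) ** 2 for a, b in zip(*betas))
    total += lam_sp * sum(abs(b) for beta in betas for b in beta)
    return total

def smooth_grad(betas, groups, lam_sim):
    """Gradient of the smooth part: logistic losses + squared-difference coupling."""
    grads = []
    for k, (X, y) in enumerate(groups):
        g = [lam_sim * (bk - bo) for bk, bo in zip(betas[k], betas[1 - k])]
        for xi, yi in zip(X, y):
            r = (sigmoid(dot(betas[k], xi)) - yi) / len(X)
            for j, x in enumerate(xi):
                g[j] += r * x
        grads.append(g)
    return grads

def fit_joint(groups, lam_sim, lam_sp, iters=300):
    """FISTA-style accelerated proximal gradient for exactly two groups."""
    p = len(groups[0][0][0])
    # Step size 1/L, where L upper-bounds the Lipschitz constant of smooth_grad.
    max_sq_norm = max(dot(x, x) for X, _ in groups for x in X)
    step = 1.0 / (0.25 * max_sq_norm + 2.0 * lam_sim)
    betas = [[0.0] * p for _ in groups]
    look = [b[:] for b in betas]  # extrapolated (momentum) point
    t = 1.0
    for _ in range(iters):
        grads = smooth_grad(look, groups, lam_sim)
        new = [[soft_threshold(v - step * g, step * lam_sp) for v, g in zip(vk, gk)]
               for vk, gk in zip(look, grads)]
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        look = [[n + (t - 1.0) / t_next * (n - b) for n, b in zip(nk, bk)]
                for nk, bk in zip(new, betas)]
        betas, t = new, t_next
    return betas

def simulate(n, true_beta, rng):
    """Hypothetical data: features uniform on [-1, 1], Bernoulli labels."""
    X = [[rng.uniform(-1, 1) for _ in true_beta] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(dot(true_beta, x)) else 0 for x in X]
    return X, y

def gap(betas):
    """Euclidean distance between the two groups' coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(*betas)))

rng = random.Random(0)
groups = [simulate(200, [2.0, -1.0], rng),  # well-represented group
          simulate(20, [2.0, 1.0], rng)]    # under-represented group

separate = fit_joint(groups, lam_sim=0.0, lam_sp=0.01)   # no coupling
coupled = fit_joint(groups, lam_sim=50.0, lam_sp=0.01)   # strong coupling
print("separate:", separate, "gap:", gap(separate))
print("coupled: ", coupled, "gap:", gap(coupled))
```

In this sketch, `lam_sim=0` fits each group on its own, in the spirit of the group-separate baseline, while a very large `lam_sim` drives both groups toward one shared coefficient vector. Intermediate values let the small group borrow strength from the large one without forcing identical models, which is the trade-off a joint objective is meant to expose.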
