Questioning causality on sex, gender and COVID-19, and identifying bias in large-scale data-driven analyses: the Bias Priority Recommendations and Bias Catalog for Pandemics

The COVID-19 pandemic has spurred a large number of observational studies reporting links between sex and gender and the risk of developing severe COVID-19 or dying from it. By reviewing a large body of related literature and conducting a fine-grained analysis of sex-disaggregated data from 61 countries spanning 5 continents, we identify several confounding factors that could explain the supposed male vulnerability to COVID-19. We thus highlight the challenge of making causal claims based on available data, given the lack of statistical significance and the potential existence of biases. Informed by our findings on potential variables acting as confounders, we contribute a broad overview of the issues that bias, explainability and fairness entail in data-driven analyses. We then outline a set of policy consequences that, if based on such results, could lead to unintended discrimination. To raise awareness of the many dimensions of these foreseeable impacts, we have compiled an encyclopedia-like reference guide, the Bias Catalog for Pandemics (BCP), which provides definitions and realistic examples of bias, both in general and within the COVID-19 pandemic context. These are organized into bias families and ranked on a 2-level priority scale, together with preventive steps. In addition, we provide the Bias Priority Recommendations on how best to use and apply this catalog, along with guidelines for addressing real-world research questions. The objective is to anticipate and avoid disparate impact and discrimination by considering causality, explainability, bias and techniques to mitigate the latter. With these, we hope to contribute to 1) designing and conducting fair and equitable data-driven studies and research; and 2) interpreting and drawing meaningful and actionable conclusions from them.

arXiv:2104.14492v1 [cs.CY] 29 Apr 2021
