Measuring the effects of confounders in medical supervised classification problems: the Confounding Index (CI)

Over the years, there has been growing interest in using machine learning techniques for biomedical data processing. When tackling these tasks, one needs to bear in mind that biomedical data depend on a variety of characteristics, such as demographic aspects (age, gender, etc.) or the acquisition technology, which might be unrelated to the target of the analysis. In supervised tasks, failing to match the ground-truth targets with respect to such characteristics, called confounders, may lead to very misleading estimates of predictive performance. Many strategies have been proposed to handle confounders, ranging from data selection, to normalization techniques, up to the use of training algorithms for learning with imbalanced data. However, all these solutions require the confounders to be known a priori. To address this limitation, we introduce a novel index that measures the confounding effect of a data attribute in a bias-agnostic way. This index can be used to quantitatively compare the confounding effects of different variables and to inform correction methods such as normalization procedures or ad-hoc-prepared learning algorithms. The effectiveness of this index is validated on both simulated data and real-world neuroimaging data.
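
The claim that unmatched confounders can inflate performance estimates can be illustrated with a small simulation. The sketch below is not the Confounding Index itself, whose definition is not given in this abstract. It is a minimal toy example under stated assumptions: a single feature that depends only on a binary confounder and not on the class label, scored with a pairwise (Mann-Whitney style) AUC. The function names `auc` and `simulate`, the shift of 2.0, and the 0.9/0.1 sampling proportions are all hypothetical choices made for illustration.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def simulate(p_conf_given_pos, p_conf_given_neg, n=2000, shift=2.0, seed=0):
    """Draw labels, a binary confounder, and a feature driven ONLY by the
    confounder (assumption: the label has no direct effect on the feature)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    # Probability of confounder == 1 depends on the class in unmatched data.
    p = np.where(labels == 1, p_conf_given_pos, p_conf_given_neg)
    confounder = (rng.random(n) < p).astype(int)
    feature = shift * confounder + rng.normal(size=n)
    return feature, labels

# Unmatched sample: the confounder is far more common among positives.
biased_auc = auc(*simulate(0.9, 0.1))
# Matched sample: the confounder is equally common in both classes.
matched_auc = auc(*simulate(0.5, 0.5))

print(f"AUC with unmatched confounder: {biased_auc:.2f}")
print(f"AUC with matched confounder:   {matched_auc:.2f}")
```

In this toy setting the feature carries no information about the label, so the AUC on the matched sample should sit close to chance, while the unmatched sample yields a much higher AUC that reflects only the confounder.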
