Statistical quantification of confounding bias in predictive modelling

The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively. The tests provide strict control of Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions, which are common in machine learning. Applying the proposed tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or found to be hard to correct for with state-of-the-art confound mitigation approaches. The tests (implemented in the package mlconfound) can aid the assessment and improvement of the generalizability and neurobiological validity of predictive models and, thereby, foster the development of clinically useful machine learning biomarkers.
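To make the null hypothesis of an unconfounded model concrete, the following is a minimal sketch, not the proposed test itself. It assumes the null can be read as "the prediction is independent of the confounder once the target is accounted for", and it swaps in a simple linear stand-in: residualize both the prediction and the confounder on the target, then permute the confounder residuals. The function names (`residualize`, `partial_confound_sketch`) and the simulated data are hypothetical and serve only as illustration; unlike the tests described above, this linear version does not handle non-linear dependence.

```python
import numpy as np


def residualize(v, y):
    """Remove the linear effect of y (plus an intercept) from v."""
    X = np.column_stack([np.ones_like(y), y])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ beta


def partial_confound_sketch(y, yhat, c, n_perm=2000, seed=0):
    """Approximate permutation test of 'yhat independent of c given y'.

    Simplified linear stand-in (an assumption of this sketch):
    correlate the y-residuals of yhat and c, and build the null
    distribution by shuffling the residuals of c.
    """
    rng = np.random.default_rng(seed)
    r_yhat = residualize(yhat, y)
    r_c = residualize(c, y)
    observed = abs(np.corrcoef(r_yhat, r_c)[0, 1])
    null = np.array([
        abs(np.corrcoef(r_yhat, rng.permutation(r_c))[0, 1])
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Simulated illustration: the confounder c is related to the target y.
rng = np.random.default_rng(42)
n = 500
y = rng.normal(size=n)
c = 0.5 * y + rng.normal(size=n)

# A prediction driven by the target only, and one that also leaks c.
yhat_clean = y + 0.5 * rng.normal(size=n)
yhat_confounded = y + 1.0 * c + 0.5 * rng.normal(size=n)

stat_clean, p_clean = partial_confound_sketch(y, yhat_clean, c)
stat_conf, p_conf = partial_confound_sketch(y, yhat_confounded, c)
print(f"clean model:      |r| = {stat_clean:.3f}, p = {p_clean:.4f}")
print(f"confounded model: |r| = {stat_conf:.3f}, p = {p_conf:.4f}")
```

In this sketch a small p-value for the second model would indicate that its predictions carry confounder information beyond what the target explains, which is the situation the partial confounder test is designed to flag.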
