Hierarchical Resampling for Bagging in Multi-Study Prediction with Applications to Human Neurochemical Sensing

Prediction settings with multiple studies have become increasingly common. Ensembling models trained on individual studies has been shown to improve replicability in new studies. Motivated by a new sensing technology in human neuroscience, we introduce two generalizations of multi-study ensemble prediction. First, while existing methods weight ensemble elements by cross-study prediction performance, we extend weighting schemes to also incorporate covariate similarity between the training data and the target validation studies. Second, we introduce a hierarchical resampling scheme that generates pseudo-study replicates ("study straps"), and we ensemble classifiers trained on these replicates rather than on the original studies themselves. We demonstrate analytically that existing methods are special cases of our approach. Through a tuning parameter, our approach forms a continuum between merging all training data and training with existing multi-study ensembles. Leveraging this continuum helps accommodate different levels of between-study heterogeneity. Our methods are motivated by the application of voltammetry in humans. This technique records electrical measurements in the brain and converts the signals into neurotransmitter concentration estimates using a chemometric prediction model. Using this model in practice presents a cross-study challenge, for which we show marked improvements after applying our methods. We verify our methods in simulations and provide the studyStrap R package.
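
The two-level resampling idea can be illustrated with a minimal Python sketch. It is not the interface of the studyStrap R package, and the abstract does not spell out the resampling details, so the scheme below is an assumption: the function name `study_strap`, the parameter `bag_size` (standing in for the tuning parameter), and the proportional row-sampling rule are all hypothetical.

```python
import random
from collections import Counter


def study_strap(studies, bag_size, rng):
    """Build one pseudo-study from a dict {study_name: list_of_rows}.

    Assumed two-level scheme (illustrative, not taken from the paper):
      1. draw `bag_size` study names with replacement;
      2. from each distinct study drawn, sample rows without replacement,
         in proportion to how often that study appeared in the bag.
    Every study is assumed to contain at least one row.
    """
    names = sorted(studies)
    bag = Counter(rng.choice(names) for _ in range(bag_size))
    pseudo_study = []
    for name, count in bag.items():
        rows = studies[name]
        # Share of this study's rows to keep, at least one row.
        n_take = min(len(rows), max(1, round(len(rows) * count / bag_size)))
        pseudo_study.extend(rng.sample(rows, n_take))
    return pseudo_study


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: each row is (study_label, observation_index).
    toy = {s: [(s, i) for i in range(30)] for s in ("A", "B", "C")}
    for b in (1, 3, 300):
        replicate = study_strap(toy, b, rng)
        print(b, Counter(label for label, _ in replicate))
```

Under these assumptions, `bag_size = 1` returns a single original study (the existing multi-study ensemble setting), while a large `bag_size` yields pseudo-studies that draw from every study in roughly equal shares and so resemble subsamples of the merged data. A model would then be trained on each replicate and the predictions combined.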
