Combining Citizen Science and Deep Learning to Amplify Expertise in Neuroimaging

Research in many fields has become increasingly reliant on large and complex datasets. “Big Data” holds untold promise to rapidly advance science by tackling new questions that cannot be answered with smaller datasets. While powerful, research with Big Data poses unique challenges, as many standard lab protocols rely on experts examining each one of the samples. This is not feasible for large-scale datasets because manual approaches are time-consuming and hence difficult to scale. Meanwhile, automated approaches lack the accuracy of examination by highly trained scientists, and this may introduce major errors, sources of noise, and unforeseen biases into these large and complex datasets. Our proposed solution is to 1) start with a small, expertly labeled dataset, 2) amplify labels through web-based tools that engage citizen scientists, and 3) train machine learning on amplified labels to emulate expert decision making. As a proof of concept, we developed a system to quality control a large dataset of three-dimensional magnetic resonance images (MRI) of human brains. An initial dataset of 200 brain images labeled by experts was amplified by citizen scientists to label 722 brains, with over 80,000 ratings done through a simple web interface. A deep learning algorithm was then trained to predict data quality, based on a combination of the citizen scientist labels that accounts for differences in the quality of classification by different citizen scientists. In an ROC analysis on left-out test data, the deep learning network performed as well as a state-of-the-art, specialized algorithm (MRIQC) for quality control of T1-weighted images, each with an area under the curve of 0.99. Finally, as a specific practical application of the method, we explore how brain image quality relates to the replicability of a well-established relationship between brain volume and age over development.
Combining citizen science and deep learning can generalize and scale expert decision making; this is particularly important in emerging disciplines where specialized, automated tools do not already exist.
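
The label-amplification step described above can be illustrated with a minimal sketch. It assumes a simple scheme that is not taken from the paper: each citizen scientist is weighted by their accuracy on the expert-labeled images, and the weighted mean vote per image becomes the amplified label. The function names, the `(rater, image, vote)` tuple layout, and the `default_weight` parameter are all hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def rater_accuracy(ratings, gold):
    """Fraction of each rater's votes on expert-labeled images that match the expert."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rater, image, vote in ratings:
        if image in gold:
            total[rater] += 1
            correct[rater] += int(vote == gold[image])
    return {rater: correct[rater] / total[rater] for rater in total}


def aggregate(ratings, accuracy, default_weight=0.5):
    """Accuracy-weighted mean vote per image, giving a score in [0, 1].

    Raters who never saw an expert-labeled image get `default_weight`
    (an assumption of this sketch).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for rater, image, vote in ratings:
        weight = accuracy.get(rater, default_weight)
        num[image] += weight * vote
        den[image] += weight
    return {image: num[image] / den[image] for image in num if den[image] > 0}


# Toy data: vote 1 = "pass", 0 = "fail"; img1 and img2 have expert labels.
ratings = [
    ("a", "img1", 1), ("a", "img2", 0), ("a", "img3", 1),
    ("b", "img1", 1), ("b", "img2", 1), ("b", "img3", 0),
]
gold = {"img1": 1, "img2": 0}

accuracy = rater_accuracy(ratings, gold)
scores = aggregate(ratings, accuracy)
```

In this toy example, rater `a` agrees with the experts on both gold images and rater `b` on one of two, so `a`'s vote counts twice as much as `b`'s on the unlabeled `img3`. The resulting per-image scores could then serve as training targets for an image classifier; the paper's actual combination method may differ from this weighting.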
