Multiple Instance Neuroimage Transformer

We propose, for the first time, a convolution-free transformer model based on multiple instance learning, called Multiple Instance Neuroimage Transformer (MINiT), for the classification of T1-weighted (T1w) MRIs. We first present several variants of transformer models adapted for neuroimages. These models extract non-overlapping 3D blocks from the input volume and perform multi-headed self-attention on a sequence of their linear projections. MINiT, in contrast, treats each non-overlapping 3D block of the input MRI as its own instance and splits it further into non-overlapping 3D patches, on which multi-headed self-attention is computed. As a proof of concept, we evaluate the efficacy of our model by training it to identify sex from the T1w MRIs of two public datasets: the Adolescent Brain Cognitive Development (ABCD) study and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA). The learned attention maps highlight voxels that contribute to identifying sex differences in brain morphometry. The code is available at https://github.com/singlaayush/MINIT.
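The two-level split described above (volume into blocks, each block into patches) can be illustrated with a minimal NumPy sketch. This is not the released MINiT code: the function names `split_into_cubes` and `volume_to_patch_tokens`, the 64-voxel cubic volume, the block size of 32, and the patch size of 8 are all assumptions chosen for illustration, and the linear projection and self-attention steps are omitted.

```python
import numpy as np


def split_into_cubes(volume, size):
    """Split a 3D array into non-overlapping cubes of side `size`.

    Returns an array of shape (n_cubes, size, size, size).
    """
    d, h, w = volume.shape
    if d % size or h % size or w % size:
        raise ValueError("every dimension must be divisible by `size`")
    nd, nh, nw = d // size, h // size, w // size
    # Expose the cube grid (nd, nh, nw) and the within-cube axes separately.
    cubes = volume.reshape(nd, size, nh, size, nw, size)
    # Move the grid axes to the front, the within-cube axes to the back.
    cubes = cubes.transpose(0, 2, 4, 1, 3, 5)
    return cubes.reshape(nd * nh * nw, size, size, size)


def volume_to_patch_tokens(volume, block_size, patch_size):
    """Hypothetical helper: treat each block as an instance and flatten
    its patches into a token sequence.

    Returns an array of shape (n_blocks, n_patches_per_block, patch_size**3).
    """
    blocks = split_into_cubes(volume, block_size)
    return np.stack([
        split_into_cubes(block, patch_size).reshape(-1, patch_size ** 3)
        for block in blocks
    ])


# Illustrative sizes only; real T1w volumes and the paper's settings may differ.
volume = np.arange(64 ** 3, dtype=np.float32).reshape(64, 64, 64)
tokens = volume_to_patch_tokens(volume, block_size=32, patch_size=8)
print(tokens.shape)  # (8, 64, 512)
```

Under these assumed sizes, each of the 8 blocks yields its own sequence of 64 flattened patches, so self-attention would be computed within a block rather than across the whole volume.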
