Accurate Reconstruction of Image Stimuli From Human Functional Magnetic Resonance Imaging Based on the Decoding Model With Capsule Network Architecture

In neuroscience, a wide range of computational models have been designed to answer the open question of how sensory stimuli are encoded by neurons and, conversely, how sensory stimuli can be decoded from neuronal activity. In particular, functional magnetic resonance imaging (fMRI) studies have achieved a great deal with the rapid development of deep network computation. However, compared with decoding orientation, position, and object category from human fMRI in the visual cortex, accurate reconstruction of image stimuli remains challenging. Current prevailing methods consist of two independent steps: (1) decoding intermediate features from human fMRI and (2) reconstructing the image from the decoded intermediate features. The concept of the ‘capsule’ and the ‘capsule’-based neural network were proposed recently. A ‘capsule’ is a structure containing a group of neurons that provides a better feature representation. In particular, the high-level capsule features in the capsule network (CapsNet) contain various attributes of image stimuli, such as semantic class, orientation, location, and scale, and these features can better represent the processed information inherent in fMRI data collected from the visual cortex. In this paper, a novel CapsNet architecture-based visual reconstruction (CNAVR) computation model is developed to reconstruct image stimuli from human fMRI. The CNAVR is composed of a linear encoding computation from capsule features to fMRI data and an inverse reconstruction computation. In the first part, we trained the CapsNet model to obtain the non-linear mappings from images to high-level capsule features, and from high-level capsule features back to images, in an end-to-end manner. In the second part, we trained the non-linear mapping from the fMRI data of selected voxels to high-level capsule features. For a new image stimulus, the method predicts the corresponding high-level capsule features from fMRI data and reconstructs the image stimulus with the trained reconstruction part of the CapsNet. We evaluated the proposed CNAVR method on an open dataset of handwritten digit images and exceeded the accuracy of all existing state-of-the-art methods by about 10% on the structural similarity index (SSIM). In addition, we interpreted the selected voxels in terms of specific interpretable image features to demonstrate the effectiveness and generalization of the CNAVR method.
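The two-stage decode-then-reconstruct idea and the SSIM-based evaluation described above can be illustrated with a small, self-contained sketch. It is not the CNAVR implementation: the data are synthetic, a closed-form ridge regression stands in for the trained mapping from voxels to capsule features, a fixed random projection followed by a sigmoid (`toy_decoder`) stands in for the reconstruction part of the CapsNet, and `global_ssim` computes SSIM over a single whole-image window rather than the usual sliding window. All names and sizes are hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over two whole images (simplification: no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^(-1) X^T Y."""
    n_inputs = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_inputs), X.T @ Y)

# Synthetic stand-ins (assumed sizes, not taken from the paper).
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_feat, side = 300, 50, 40, 16, 28

# Hypothetical decoder: maps a feature vector to a side x side image in (0, 1).
D = rng.normal(size=(n_feat, side * side))

def toy_decoder(feats):
    return 1.0 / (1.0 + np.exp(-(feats @ D)))

# "Capsule features" and noisy "fMRI responses" generated from them.
feats = rng.normal(size=(n_train + n_test, n_feat))
A = rng.normal(size=(n_feat, n_voxels))
fmri = feats @ A + 0.1 * rng.normal(size=(n_train + n_test, n_voxels))

# Stage 1: learn the voxel-to-feature mapping on the training trials.
W = fit_ridge(fmri[:n_train], feats[:n_train], alpha=1.0)

# Stage 2: predict features for held-out trials and reconstruct images.
pred_feats = fmri[n_train:] @ W
true_imgs = toy_decoder(feats[n_train:]).reshape(-1, side, side)
recon_imgs = toy_decoder(pred_feats).reshape(-1, side, side)

# Evaluate reconstructions with the simplified SSIM.
scores = [global_ssim(t, r) for t, r in zip(true_imgs, recon_imgs)]
mean_ssim = float(np.mean(scores))
print(f"mean SSIM on held-out trials: {mean_ssim:.3f}")
```

In practice the windowed SSIM of the original image-quality literature would be used for reporting, and the linear stand-ins would be replaced by the trained networks; the sketch only shows how the pieces connect.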
