Generative adversarial networks for reconstructing natural images from brain activity

ABSTRACT We explore a method for reconstructing visual stimuli from brain activity. Using large databases of natural images, we trained a deep convolutional generative adversarial network (DCGAN) capable of generating grayscale photos similar to the stimuli presented during two functional magnetic resonance imaging experiments. Using a linear model, we learned to predict the generative model's latent space from measured brain activity. The objective was to create an image similar to the presented stimulus by passing the predicted latent vector through the previously trained generator. With this approach we were able to reconstruct structural and some semantic features for a proportion of the natural image sets. A behavioural test showed that subjects were able to identify the reconstruction of the original stimulus in a pairwise comparison in 67.2% and 66.4% of cases for the two natural image datasets, respectively. Our approach does not require end-to-end training of a large generative model on limited neuroimaging data. Rapid advances in generative modeling promise further improvements in reconstruction performance.

HIGHLIGHTS
A generative adversarial network (DCGAN) is used for reconstructing visual percepts.
Minimizing image loss, a linear model learns to predict the latent space from BOLD.
With a GAN limited to 6 handwritten characters, detailed features can be retrieved.
Reconstructions of arbitrary natural images are identifiable by human raters.
The specific GAN is a replaceable component and can be swapped for more advanced deterministic generators.
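The decoding idea summarized above (a linear map from brain activity to a latent vector, trained by minimizing an image loss through a frozen generator) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the "generator" is a fixed random map followed by tanh rather than a trained DCGAN, the BOLD data are synthetic, the image loss is plain pixel-wise mean squared error, and all names and sizes (`generate`, `bold`, `W`, `n_voxels`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels, n_latent, n_pixels = 200, 50, 8, 64

# Hypothetical frozen "generator": a fixed random map from latent space to pixels.
G = rng.normal(size=(n_latent, n_pixels)) / np.sqrt(n_latent)

def generate(z):
    """Toy stand-in for a pretrained generator; G is never updated."""
    return np.tanh(z @ G)

# Synthetic data (assumption): stimuli are generated from true latents,
# and BOLD responses are a noisy linear function of the same latents.
z_true = rng.normal(size=(n_trials, n_latent))
images = generate(z_true)
A = rng.normal(size=(n_latent, n_voxels))
bold = z_true @ A + 0.1 * rng.normal(size=(n_trials, n_voxels))
bold = (bold - bold.mean(axis=0)) / bold.std(axis=0)  # z-score each voxel

def image_loss(W):
    """Pixel-wise MSE between reconstructions and the presented stimuli."""
    return np.mean((generate(bold @ W) - images) ** 2)

# Linear model W maps voxels to latent space; only W is trained.
W = np.zeros((n_voxels, n_latent))
initial_loss = image_loss(W)

lr = 0.02
for _ in range(1000):
    recon = generate(bold @ W)
    err = recon - images
    # Backpropagate the image loss through tanh and the frozen generator.
    grad_pre = 2.0 * err * (1.0 - recon ** 2) / err.size
    grad_z = grad_pre @ G.T
    W -= lr * (bold.T @ grad_z)

final_loss = image_loss(W)
print(f"image loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

The point of the design is that the loss is measured in image space while only the small linear model is fitted, so the generator itself never has to be trained on the limited neuroimaging data.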
