Simultaneous Neural Spike Encoding and Decoding Based on Cross-modal Dual Deep Generative Model

Neural encoding and decoding of retinal ganglion cells (RGCs) are of great importance in research on brain-machine interfaces. Much effort has gone into mimicking RGCs and into interpreting RGC signals to reconstruct stimuli. However, two challenges remain. First, complex nonlinear processes in retinal neural circuits limit how well encoding models can fit natural stimuli and model RGCs accurately. Second, current research treats decoding separately from encoding, neglecting the way each can promote the other. To alleviate these problems, we propose a cross-modal dual deep generative model (CDDG). CDDG treats the RGC spike signals and the stimuli as two modalities; it learns a shared latent representation for the concatenated modality and two modality-specific latent representations. It then imposes a distribution-consistency restriction on the different latent spaces, and cross-consistency and cycle-consistency constraints on the generated variables. The model thereby supports cross-modal generation from RGC spike signals to stimuli and vice versa. In our framework, generation from stimuli to RGC spike signals is equivalent to neural encoding, while the inverse process is equivalent to neural decoding. Hence, the proposed method integrates neural encoding and decoding and exploits the reciprocity between them. Experimental results on three salamander RGC spike datasets with natural stimuli show that the proposed method achieves excellent encoding and decoding performance compared with state-of-the-art methods.
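The three kinds of constraint named above can be illustrated with a toy sketch. This is not the CDDG implementation: the abstract does not give the form of the losses or the network architecture, so everything here is an assumption made for illustration. The encoders and decoders are plain linear maps, the latent variances are fixed to one, the distribution-consistency term is taken to be a Gaussian KL divergence, and the cross- and cycle-consistency terms are taken to be mean squared errors. All names (`cddg_style_losses`, `enc_stim`, `dec_spk`, and so on) are hypothetical.

```python
import math
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def gaussian_kl(mu_p, logvar_p, mu_q, logvar_q):
    """KL(p || q) between two diagonal Gaussians."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, logvar_p, mu_q, logvar_q):
        total += 0.5 * (lq - lp + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq) - 1.0)
    return total


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix; stands in for a trained network."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def cddg_style_losses(stimulus, spikes, params):
    """Compute illustrative (assumed) versions of the three constraints."""
    # Modality-specific latent means and the shared latent mean for the
    # concatenated modality. Log-variances are fixed to 0 for brevity.
    z_stim = linear(params["enc_stim"], stimulus)
    z_spk = linear(params["enc_spk"], spikes)
    z_joint = linear(params["enc_joint"], stimulus + spikes)
    zeros = [0.0] * len(z_joint)

    # Distribution consistency: pull each modality-specific latent
    # distribution toward the shared one (assumed here to be a KL term).
    distribution = (gaussian_kl(z_stim, zeros, z_joint, zeros)
                    + gaussian_kl(z_spk, zeros, z_joint, zeros))

    # Cross-modal generation: stimulus -> spikes plays the role of neural
    # encoding, spikes -> stimulus plays the role of neural decoding.
    spikes_hat = linear(params["dec_spk"], z_stim)
    stim_hat = linear(params["dec_stim"], z_spk)
    cross = mse(spikes_hat, spikes) + mse(stim_hat, stimulus)

    # Cycle consistency: map each generated variable back to its source
    # modality and compare with the original input.
    stim_cycle = linear(params["dec_stim"], linear(params["enc_spk"], spikes_hat))
    spk_cycle = linear(params["dec_spk"], linear(params["enc_stim"], stim_hat))
    cycle = mse(stim_cycle, stimulus) + mse(spk_cycle, spikes)

    return {"distribution": distribution, "cross": cross, "cycle": cycle}


# Toy usage with made-up sizes: 4 stimulus values, 3 cells, 2 latent units.
rng = random.Random(0)
stim_dim, spike_dim, latent_dim = 4, 3, 2
params = {
    "enc_stim": random_matrix(latent_dim, stim_dim, rng),
    "enc_spk": random_matrix(latent_dim, spike_dim, rng),
    "enc_joint": random_matrix(latent_dim, stim_dim + spike_dim, rng),
    "dec_stim": random_matrix(stim_dim, latent_dim, rng),
    "dec_spk": random_matrix(spike_dim, latent_dim, rng),
}
stimulus = [0.2, 0.8, 0.5, 0.1]
spikes = [1.0, 0.0, 2.0]
losses = cddg_style_losses(stimulus, spikes, params)
print(losses)
```

In a trained model these terms would typically be weighted and summed with the usual reconstruction objectives; the weighting, like the rest of this sketch, is not specified in the abstract.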
