论文信息 - Generating Natural Language Descriptions for Semantic Representations of Human Brain Activity

Generating Natural Language Descriptions for Semantic Representations of Human Brain Activity

Quantitative analysis of human brain activity based on language representations, such as the semantic categories of words, have been actively studied in the field of brain and neuroscience. Our study aims to generate natural language descriptions for human brain activation phenomena caused by visual stimulus by employing deep learning methods, which have gained interest as an effective approach to automatically describe natural language expressions for various type of multi-modal information, such as images. We employed an image-captioning system based on a deep learning framework as the basis for our method by learning the relationship between the brain activity data and the features of an intermediate expression of the deep neural network owing to lack of training brain data. We conducted three experiments and were able to generate natural language sentences which enabled us to quantitatively interpret brain activity.

[1] J. Gallant,et al. Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies , 2011, Current Biology.

[2] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.

[3] Yejin Choi,et al. TreeTalk: Composition and Compression of Trees for Image Descriptions , 2014, TACL.

[4] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[5] Alexander G. Huth,et al. Attention During Natural Vision Warps Semantic Representation Across the Human Brain , 2013, Nature Neuroscience.

[6] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[7] Aykut Erdem,et al. A Distributed Representation Based Query Expansion Approach for Image Captioning , 2015, ACL.

[8] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[9] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.

[10] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.

[12] Christopher Joseph Pal,et al. Describing Videos by Exploiting Temporal Structure , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[13] Y Kamitani,et al. Neural Decoding of Visual Imagery During Sleep , 2013, Science.

[14] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15] Frank Keller,et al. Image Description using Visual Dependency Representations , 2013, EMNLP.

[16] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.

[17] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[18] Ruslan Salakhutdinov,et al. Multimodal Neural Language Models , 2014, ICML.

[19] Francisco Pereira,et al. Using Wikipedia to learn semantic feature representations of concrete concepts in neuroimaging experiments , 2013, Artif. Intell..

[20] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[21] Sanja Fidler,et al. Order-Embeddings of Images and Language , 2015, ICLR.

[22] Tom Michael Mitchell,et al. Predicting Human Brain Activity Associated with the Meanings of Nouns , 2008, Science.