Limiting Factors for Mapping Corpus-Based Semantic Representations to Brain Activity

To help understand how semantic information is represented in the human brain, a number of previous studies have explored how a linear mapping can be learned from corpus-derived semantic representations to the corresponding patterns of fMRI brain activation. They have demonstrated that such a mapping for concrete nouns can predict brain activations with accuracy significantly above chance, but the more recent elaborations have achieved relatively little performance improvement over the original study. In fact, the absolute accuracies of all these models remain rather limited, and it is not clear which aspects of the approach need improving in order to reach performance levels that might lead to better accounts of human capabilities. This paper presents a systematic series of computational experiments designed to identify the limiting factors of the approach. Two distinct series of artificial brain activation vectors with varying levels of noise are introduced to characterize how the brain activation data restrict performance, and improved corpus-based semantic vectors are developed to determine how the word set and model inputs affect the results. These experiments lead to the conclusion that the current state-of-the-art input semantic representations are already operating nearly perfectly (at least for non-ambiguous concrete nouns), and that it is primarily the quality of the fMRI data that limits what can be achieved with this approach. The results allow the study to end with empirically informed suggestions about the best directions for future research in this area.
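As a rough illustration of the kind of experiment described above, the following sketch fits a linear map from semantic vectors to activation vectors and scores it on artificial activations with and without noise. It is a minimal sketch under stated assumptions, not the authors' implementation: the ridge-regularized least-squares fit, the leave-two-out matching score based on cosine similarity, the random "semantic" vectors, and all function names and sizes are hypothetical choices made for this example.

```python
import numpy as np

def fit_linear_map(S, B, ridge):
    """Ridge-regularized least-squares map W such that S @ W approximates B."""
    d = S.shape[1]
    return np.linalg.solve(S.T @ S + ridge * np.eye(d), S.T @ B)

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def leave_two_out_accuracy(S, B, ridge=1e-6):
    """Hold out each pair of words, train on the rest, and check whether the
    two predicted activations match the correct held-out activations better
    than the swapped assignment (an assumed evaluation scheme)."""
    n = S.shape[0]
    correct, total = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            train = [k for k in range(n) if k != i and k != j]
            W = fit_linear_map(S[train], B[train], ridge)
            pred_i, pred_j = S[i] @ W, S[j] @ W
            match = cosine(pred_i, B[i]) + cosine(pred_j, B[j])
            swap = cosine(pred_i, B[j]) + cosine(pred_j, B[i])
            correct += int(match > swap)
            total += 1
    return correct / total

def make_artificial_activations(S, n_voxels, noise_sd, rng):
    """Artificial 'brain' vectors: a random linear image of the semantic
    vectors plus Gaussian noise (a hypothetical construction)."""
    W_true = rng.standard_normal((S.shape[1], n_voxels))
    clean = S @ W_true
    return clean + noise_sd * rng.standard_normal(clean.shape)

rng = np.random.default_rng(0)
S = rng.standard_normal((20, 5))          # 20 words, 5 semantic dimensions
B_clean = make_artificial_activations(S, n_voxels=50, noise_sd=0.0, rng=rng)
B_noisy = make_artificial_activations(S, n_voxels=50, noise_sd=20.0, rng=rng)

acc_clean = leave_two_out_accuracy(S, B_clean)
acc_noisy = leave_two_out_accuracy(S, B_noisy)
print("noise-free accuracy:", acc_clean)
print("noisy accuracy:", acc_noisy)
```

In this toy setting the noise-free activations are an exact linear function of the inputs, so any shortfall in the noisy case can only come from the activation data, which mirrors the logic of using artificial activations to separate input quality from data quality.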
