Sparsity-Constrained fMRI Decoding of Visual Saliency in Naturalistic Video Streams

Naturalistic stimuli such as video watching are increasingly used in functional magnetic resonance imaging (fMRI)-based brain encoding and decoding studies because they provide the realistic, dynamic information that the human brain processes in everyday life. In this paper, we propose a sparsity-constrained decoding model to explore whether bottom-up visual saliency in continuous video streams can be effectively decoded from brain activity recorded by fMRI, and to examine whether sparsity constraints improve visual saliency decoding. Specifically, we use a biologically plausible computational model to quantify the visual saliency in video streams, and adopt a sparse representation algorithm to learn atomic fMRI signal dictionaries that are representative of the patterns of whole-brain fMRI signals. Sparse representation also links the learned atomic dictionary with the quantified video saliency. Experimental results show that the temporal visual saliency in video streams can be well decoded and that sparsity constraints improve the performance of fMRI decoding models.
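The linking step described above, in which a saliency time course is expressed as a sparse combination of dictionary atoms, can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary `D` is a random synthetic stand-in for a learned fMRI dictionary, the saliency curve is simulated, and the solver (ISTA for an L1-penalized least-squares problem) and all names and parameter values (`ista_lasso`, `lam`, `n_atoms`) are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of t*||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_lasso(D, s, lam, n_iter=500):
    """Approximately minimize 0.5*||s - D w||^2 + lam*||w||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - s)           # gradient of the quadratic term
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Synthetic stand-in data (assumption): 200 time points, 30 temporal atoms.
rng = np.random.default_rng(0)
n_timepoints, n_atoms = 200, 30
D = rng.standard_normal((n_timepoints, n_atoms))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms

# Simulated saliency curve driven by only two atoms, plus small noise.
true_w = np.zeros(n_atoms)
true_w[[2, 7]] = [1.5, -2.0]
saliency = D @ true_w + 0.01 * rng.standard_normal(n_timepoints)

# Sparse decoding: only a few atoms should receive non-negligible weight.
w_hat = ista_lasso(D, saliency, lam=0.05)
decoded = D @ w_hat
support = np.flatnonzero(np.abs(w_hat) > 0.1)
corr = np.corrcoef(decoded, saliency)[0, 1]
print("atoms selected:", support.tolist())
print("correlation between decoded and simulated saliency:", round(float(corr), 3))
```

In this toy setting the L1 penalty drives the weights of irrelevant atoms toward zero, which is the sense in which a sparsity constraint selects a small set of atoms associated with the saliency time course. On real data the dictionary would be learned from the fMRI signals and the penalty weight chosen by cross-validation.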
