A computational model of crossmodal processing for conflict resolution

The brain integrates information from multiple sensory modalities to form a coherent and robust perceptual experience in complex environments. This ability is progressively acquired and fine-tuned during development through exposure to a multisensory environment. A rich set of neural mechanisms supports the integration and segregation of multimodal stimuli, providing the means to efficiently resolve conflicts across modalities. This motivates the development of comparable mechanisms for robotic platforms that process multisensory signals and trigger robust sensory-driven motor behavior. In this paper, we implement a computational model of crossmodal integration for a sound source localization task that also accounts for audiovisual conflict resolution. Our model consists of two layers of reciprocally connected visual and auditory neurons and a layer of crossmodal neurons that learns to integrate (or segregate) audiovisual stimuli on the basis of their spatial disparity. To validate our architecture, we conducted a spatial localization study in which 30 subjects determined the location of a sound source in a virtual scenario with four animated avatars. We measured their accuracy and reaction time under different conditions with congruent and incongruent audiovisual stimuli. We then used this study as a baseline for modeling human-like behavioral responses with a neural network architecture exposed to the same experimental conditions.
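The abstract describes the architecture only at a high level. The sketch below is a minimal illustration, not the authors' implementation, of disparity-gated integration versus segregation using Gaussian population codes over azimuth; all names and parameter values (sigma_aud, sigma_vis, disparity_threshold) are illustrative assumptions, and the learned crossmodal layer is approximated here by a fixed disparity threshold.

    import numpy as np

    # Candidate sound-source locations (degrees of azimuth).
    AZIMUTHS = np.linspace(-90, 90, 181)

    def population_response(stimulus_deg, sigma):
        """Gaussian population code over azimuth for one modality."""
        r = np.exp(-(AZIMUTHS - stimulus_deg) ** 2 / (2 * sigma ** 2))
        return r / r.sum()

    def crossmodal_estimate(aud_deg, vis_deg,
                            sigma_aud=15.0, sigma_vis=5.0,
                            disparity_threshold=20.0):
        """Integrate audiovisual input when spatial disparity is small,
        otherwise segregate and rely on the auditory estimate alone
        (the task is sound-source localization)."""
        r_aud = population_response(aud_deg, sigma_aud)
        r_vis = population_response(vis_deg, sigma_vis)
        disparity = abs(aud_deg - vis_deg)
        if disparity <= disparity_threshold:
            # Integration: multiplicative fusion of the two population
            # codes, which weights the sharper (visual) code more strongly.
            r_cross = r_aud * r_vis
        else:
            # Segregation: the crossmodal layer discounts vision.
            r_cross = r_aud
        return AZIMUTHS[np.argmax(r_cross)]

    # Congruent stimuli: the visual cue sharpens the auditory estimate.
    print(crossmodal_estimate(aud_deg=10.0, vis_deg=12.0))   # ~12 deg (visual capture)
    # Incongruent stimuli: large disparity triggers segregation.
    print(crossmodal_estimate(aud_deg=10.0, vis_deg=60.0))   # 10 deg (auditory only)

Multiplicative fusion of Gaussian population codes approximates Bayesian cue combination, so the congruent estimate is pulled toward the more reliable visual cue, while the incongruent case falls back on audition; in the model described above, this gating is learned rather than hard-coded.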
