Audition as a Trigger of Head Movements

In realistic multimodal environments, audition and vision are the two prominent sensory modalities that work together to provide humans with the best possible perceptual understanding of their surroundings. Yet, when designing artificial binaural systems, this collaboration is often not honored. Instead, substantial effort goes into constructing the best-performing purely auditory scene-analysis systems, sometimes with goals and ambitions that reach beyond human capabilities. It is often overlooked that what enables us to perform so well in complex environments is the ability to (i) use more than one source of information, for instance, visual in addition to auditory information, and (ii) make assumptions about the objects to be perceived on the basis of a priori knowledge. In fact, the human capability of inferring information in one modality from another helps substantially in analyzing efficiently the complex environments that humans face every day. Along this line of thinking, this chapter addresses the effects of attention reorientation triggered by audition. Accordingly, it discusses mechanisms that lead to appropriate motor reactions, such as head movements that direct our visual sensors toward an audiovisual object of interest. After presenting some of the neuronal foundations of multimodal integration and of motor reactions linked to auditory-visual perception, some ideas and issues from the field of robotics are tackled with reference to computational modeling. In doing so, some of the biological bases that underlie active multimodal perception are discussed, and it is demonstrated how these can be taken into account when designing artificial agents endowed with human-like perception.
