Head Turning Modulation for the Multimodal Analysis of Unknown Environments (Modulation de Mouvements de Tête pour l'Analyse Multimodale d'un Environnement Inconnu)

The exploration of an unknown environment by a mobile robot is a broad research field that aims to understand and implement efficient, fast, and relevant exploration models. Since the 1980s, however, exploration has no longer been limited to determining the topography of a space: the spatial component has been coupled with a semantic component of the explored world. In addition to the physical characteristics of the environment (walls, obstacles, paths that can or cannot be taken, entrances and exits), which allow the robot to build an internal representation of the world that it uses to move around, there are dynamic characteristics such as the appearance of audiovisual events. These events matter greatly because they can modulate the robot's behavior depending on their location in space (the topographic aspect) and on the information they carry (the semantic aspect). Although unpredictable by nature (since the environment is unknown), these events are not all equally important: some carry information useful to the robot and its exploration task, others do not. Following work on intrinsic motivations to explore an unknown environment, and drawing inspiration from neurological phenomena, this thesis developed the Head Turning Modulation (HTM) model, which aims to give a robot capable of head movements the ability to determine the relative importance of an audiovisual event appearing in an unknown environment under exploration. This importance is formalized as the notion of Congruence, inspired mainly by (i) Shannon entropy, (ii) the Mismatch Negativity phenomenon, and (iii) the Reverse Hierarchy Theory.
The HTM model, created within the European project Two!Ears, is a learning paradigm based on (i) self-supervision (the robot decides when it is necessary to learn or not), (ii) a real-time constraint (the robot learns and reacts as soon as data are perceived), and (iii) an absence of prior data about the environment (there is no ground truth to learn, only the perceived reality of the environment to explore). This model, integrated into the complete Two!Ears framework, has been fully ported to a mobile robot equipped with binocular vision and binaural hearing. The HTM model thus couples a traditional bottom-up approach to the analysis of perceptual signals (feature extraction, visual or auditory recognition, etc.) with a top-down approach that, by generating a motor action, makes it possible to understand and interpret the robot's audiovisual environment. This active bottom-up/top-down approach is then used to modulate the head movements of a humanoid robot and to study the impact of Congruence on these movements. The system was evaluated through realistic simulations, as well as under real conditions, on the two robotic platforms of the Two!Ears project.
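
To give a concrete feel for how an entropy-inspired importance measure could gate head movements, here is a minimal Python sketch. It is not the Congruence definition used in the thesis, which the abstract does not specify. It assumes that audiovisual events arrive as discrete category labels, that their probabilities are estimated online from counts alone (no prior data), and that an event deserves a head turn when its Shannon surprisal exceeds the entropy of the categories observed so far. The class name `CongruenceSketch` and all of its methods are hypothetical.

```python
import math
from collections import Counter


class CongruenceSketch:
    """Toy online estimator; an illustrative assumption, not the HTM model."""

    def __init__(self):
        self.counts = Counter()  # occurrences of each audiovisual category
        self.total = 0           # number of events observed so far

    def observe(self, category):
        """Update the estimate as soon as an event is perceived."""
        self.counts[category] += 1
        self.total += 1

    def probability(self, category):
        """Empirical probability of a category (0.0 if never seen)."""
        if self.total == 0:
            return 0.0
        return self.counts[category] / self.total

    def entropy(self):
        """Shannon entropy (in bits) of the categories seen so far."""
        if self.total == 0:
            return 0.0
        return -sum(
            (n / self.total) * math.log2(n / self.total)
            for n in self.counts.values()
        )

    def surprisal(self, category):
        """Self-information -log2 p(category); infinite for unseen categories."""
        p = self.probability(category)
        return math.inf if p == 0.0 else -math.log2(p)

    def should_turn_head(self, category):
        """Flag an event whose surprisal exceeds the average surprisal."""
        return self.surprisal(category) > self.entropy()


if __name__ == "__main__":
    model = CongruenceSketch()
    for label in ["speech"] * 8 + ["alarm"] * 2:
        model.observe(label)
    for label in ["speech", "alarm", "dog"]:
        print(label, model.should_turn_head(label))
```

In this sketch a frequent category stops triggering head turns once it dominates the observed distribution, while rare or never-seen categories keep triggering them, which loosely mirrors the idea of reacting only to events that are unexpected in the current environment.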
