Multisensory Oddity Detection as Bayesian Inference

A key goal for the perceptual system is to optimally combine information from all available senses in order to build the most accurate and unified picture possible of the outside world. The contemporary theoretical framework of ideal-observer maximum likelihood integration (MLI) has been highly successful in modelling how the human brain combines information from a variety of sensory modalities. However, in various recent experiments involving multisensory stimuli of uncertain correspondence, MLI breaks down as a successful model of sensory combination. Within the paradigm of direct stimulus estimation, perceptual models that use Bayesian inference to resolve correspondence have recently been shown to generalize successfully to these cases where MLI fails. This approach has been known variously as model inference, causal inference, or structure inference. In this paper, we examine causal uncertainty in another important class of multisensory perception paradigm, that of oddity detection, and demonstrate how a Bayesian ideal observer also treats oddity detection as a structure-inference problem. We validate this approach by showing that it provides an intuitive and quantitative explanation of an important pair of multisensory oddity-detection experiments, involving cues across and within modalities, for which MLI previously failed dramatically, allowing a novel unifying treatment of within- and cross-modal multisensory perception. Our successful application of structure-inference models to the oddity-detection paradigm, and the resulting unified explanation of the across- and within-modality cases, provide further evidence that structure inference may be a commonly evolved principle for combining perceptual information in the brain.

Citation: Hospedales T, Vijayakumar S (2009) Multisensory Oddity Detection as Bayesian Inference. PLoS ONE 4(1): e4205. doi:10.1371/journal.pone.0004205

Editor: Hiroaki Matsunami, Duke University, United States of America

Received August 15, 2008; Accepted December 1, 2008; Published January 15, 2009

Copyright: © 2009 Hospedales et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: TMH was supported by the Neuroinformatics Doctoral Training Center (Neuroinformatics DTC) at the University of Edinburgh. SV is supported by a fellowship of the Royal Academy of Engineering in Learning Robotics, co-sponsored by Microsoft Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: t.hospedales@ed.ac.uk
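The contrast the abstract draws between maximum likelihood integration and structure (causal) inference can be made concrete with a minimal numerical sketch. The code below is illustrative only, not the authors' model: it assumes Gaussian cue likelihoods, uses the standard inverse-variance weighting for MLI, and follows the general form of the causal-inference model (a posterior over common vs. independent causes, with model averaging). All function names, parameter values, and the Gaussian source prior are assumptions for illustration.

```python
import numpy as np

def mli_fuse(x_v, sigma_v, x_h, sigma_h):
    """MLI: inverse-variance weighted fusion of two Gaussian cue estimates.

    Fusion is mandatory here -- the cues are always assumed to share a source.
    """
    w_v, w_h = 1.0 / sigma_v**2, 1.0 / sigma_h**2
    x_fused = (w_v * x_v + w_h * x_h) / (w_v + w_h)
    sigma_fused = np.sqrt(1.0 / (w_v + w_h))
    return x_fused, sigma_fused

def structure_inference(x_v, sigma_v, x_h, sigma_h,
                        p_common=0.5, sigma_prior=10.0):
    """Sketch of causal/structure inference over cue correspondence.

    The observer entertains two structures: a common source for both cues,
    or two independent sources drawn from a broad N(0, sigma_prior^2) prior.
    The cue discrepancy (x_v - x_h) is Gaussian under both structures, with
    a larger variance under independent sources.
    """
    d = x_v - x_h
    var_c = sigma_v**2 + sigma_h**2                      # common source
    var_i = sigma_v**2 + sigma_h**2 + 2 * sigma_prior**2  # independent sources
    lik_c = np.exp(-0.5 * d**2 / var_c) / np.sqrt(2 * np.pi * var_c)
    lik_i = np.exp(-0.5 * d**2 / var_i) / np.sqrt(2 * np.pi * var_i)
    # Posterior probability that the cues share a cause.
    post_c = p_common * lik_c / (p_common * lik_c + (1 - p_common) * lik_i)
    # Model-averaged estimate of the visually cued quantity: fuse only to
    # the extent that a common cause is probable.
    x_fused, _ = mli_fuse(x_v, sigma_v, x_h, sigma_h)
    x_est = post_c * x_fused + (1 - post_c) * x_v
    return post_c, x_est
```

With nearly coincident cues the common-cause posterior is high and the estimate approaches the MLI fusion; with widely discrepant cues the posterior drops and the estimate falls back toward the single-cue value, which is the regime where pure MLI fails.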
