Predictive coding in auditory perception: challenges and unresolved questions

Predictive coding is arguably the dominant theoretical framework in the current study of perception. It has been employed to explain important auditory perceptual phenomena, and it has inspired theoretical, experimental and computational modelling efforts aimed at describing how the auditory system parses complex sound input into meaningful units (auditory scene analysis). These efforts have uncovered several vital questions; addressing them could help to specify predictive coding further and clarify some of its basic assumptions. The goal of the current review is to motivate these questions and to show how unresolved issues in explaining some auditory phenomena lead to general questions about the theoretical framework. We focus on experimental and computational modelling issues related to sequential grouping in auditory scene analysis (auditory pattern detection and bistable perception), as we believe that this is the research topic where predictive coding has the highest potential for advancing our understanding. In addition to specific questions, our analysis led us to identify three more general questions that require further clarification: (1) What exactly is meant by prediction in predictive coding? (2) What governs which generative models make the predictions? and (3) What, if it exists, is the correlate of perceptual experience within the predictive coding framework?
