Assessing Top-Down and Bottom-Up Contributions to Auditory Stream Segregation and Integration With Polyphonic Music

Polyphonic music listening exemplifies the processes typically involved in everyday auditory scene analysis, relying on the interplay between bottom-up and top-down processes. Most studies of scene analysis have used elementary auditory scenes; however, real-world scene analysis is far more complex. In particular, music, unlike most other natural auditory scenes, can be perceived either by integrating or, under attentive control, by segregating sound streams, which are often carried by different instruments. One of the prominent bottom-up cues contributing to the perception of multi-instrument music is the timbre difference between instruments. In this work, we introduce and validate a novel paradigm designed to investigate attentional modulation, and its interaction with bottom-up processes, within naturalistic musical auditory scenes. Two psychophysical experiments are described, employing custom-composed two-voice polyphonic music pieces within a framework that implements a behavioral performance metric to validate listener instructions requiring either integration or segregation of scene elements. In Experiment 1, the listeners' locus of attention was switched between individual instruments and the aggregate (i.e., both instruments together) via a task requiring the detection of temporal modulations (i.e., triplets) incorporated within or across instruments. Subjects reported after each stimulus whether triplets were present in the to-be-attended instrument(s). Experiment 2 introduced the bottom-up manipulation by adding a three-level morphing of instrument timbre distance to the attentional framework. The task was designed for use within neuroimaging paradigms; Experiment 2 was additionally validated behaviorally in the functional Magnetic Resonance Imaging (fMRI) environment. Experiment 1 subjects (N = 29, non-musicians) completed the task with high accuracy and showed no group differences between experimental conditions. Nineteen of these listeners also participated in Experiment 2, which showed a main effect of instrument timbre distance, although timbre-distance contrasts within attention conditions did not reveal a timbre effect. Correlating overall scores with morph-distance effects, computed by subtracting the largest-distance from the smallest-distance timbre scores, showed an influence of general task difficulty on the timbre-distance effect. Comparison of laboratory and fMRI data showed that scanner noise had no adverse effect on task performance. These experimental paradigms enable the study of both bottom-up and top-down contributions to auditory stream segregation and integration in psychophysical and neuroimaging experiments.
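
As an illustration of the correlational analysis mentioned above, the sketch below shows one way a per-subject morph-distance effect (smallest-distance score minus largest-distance score) could be computed and related to overall performance. The data values, variable names, and the use of a Pearson correlation are assumptions made for illustration only; this is not the authors' analysis code.

```python
# Minimal sketch (assumed data layout): per-subject accuracy for the smallest
# and largest timbre morph distances, plus overall task accuracy.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical example data; rows correspond to subjects.
acc_smallest_distance = np.array([0.78, 0.85, 0.71, 0.90, 0.82])
acc_largest_distance = np.array([0.88, 0.91, 0.80, 0.93, 0.87])
overall_accuracy = np.array([0.83, 0.88, 0.75, 0.92, 0.85])

# Morph-distance effect: the largest-distance score subtracted from the
# smallest-distance score, as described in the abstract.
morph_distance_effect = acc_smallest_distance - acc_largest_distance

# Correlate the effect with overall scores to probe whether general task
# difficulty modulates the timbre-distance effect.
r, p = pearsonr(overall_accuracy, morph_distance_effect)
print(f"r = {r:.2f}, p = {p:.3f}")
```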
