Effects of differences in fundamental frequency on across-formant grouping in speech perception.

In an isolated syllable, a formant will tend to be segregated perceptually if its fundamental frequency (F0) differs from that of the other formants. This study explored whether similar results are found for sentences, and specifically whether differences in F0 (ΔF0) also influence across-formant grouping in circumstances where the exclusion or inclusion of the manipulated formant critically determines speech intelligibility. Three-formant (F1 + F2 + F3) analogues of almost continuously voiced natural sentences were synthesized using a monotonous glottal source (F0 = 150 Hz). Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3; F2), where F2C is a competitor for F2 that listeners must resist to optimize recognition. Competitors were created using time-reversed frequency and amplitude contours of F2, and F0 was manipulated (ΔF0 = ±8, ±2, or 0 semitones relative to the other formants). Adding F2C typically reduced intelligibility, and this reduction was greatest when ΔF0 = 0. There was an additional effect of absolute F0 for F2C, such that competitor efficacy was greater for higher F0s. However, competitor efficacy was not due to energetic masking of F3 by F2C. The results are consistent with the proposal that a grouping "primitive" based on common F0 influences the fusion and segregation of concurrent formants in sentence perception.
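As a rough illustration of the stimulus arithmetic described above, the sketch below converts each ΔF0 (in semitones) into the competitor's F0 in Hz relative to the 150-Hz source, and time-reverses a formant contour. It assumes the standard equal-tempered relation f = f_ref × 2^(n/12); the function names and the example contour values are hypothetical and are not taken from the study's synthesis procedure.

```python
# Minimal sketch, not the authors' synthesis code.
# Assumption: a semitone shift of n corresponds to a frequency ratio of 2 ** (n / 12).

BASE_F0_HZ = 150.0                       # monotonous source F0 stated in the abstract
DELTA_F0_SEMITONES = (-8, -2, 0, 2, 8)   # ΔF0 conditions stated in the abstract


def shift_semitones(f0_hz, semitones):
    """Return f0_hz shifted by the given number of semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def time_reverse(contour):
    """Return a time-reversed copy of a sampled contour (frequency or amplitude)."""
    return list(reversed(contour))


# F0 of the competitor (F2C) in each ΔF0 condition.
competitor_f0 = {st: shift_semitones(BASE_F0_HZ, st) for st in DELTA_F0_SEMITONES}

# Hypothetical F2 frequency contour in Hz, for illustration only.
f2_frequency_contour = [1200.0, 1350.0, 1500.0, 1400.0]
f2c_frequency_contour = time_reverse(f2_frequency_contour)

if __name__ == "__main__":
    for st, f0 in competitor_f0.items():
        print(f"{st:+d} semitones -> {f0:.1f} Hz")
    print("F2C contour:", f2c_frequency_contour)
```

Under this relation, the ±8-semitone conditions place the competitor's F0 at roughly 94 Hz and 238 Hz, and the ±2-semitone conditions at roughly 134 Hz and 168 Hz.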
