Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers.

Three experiments used the Coordinated Response Measure task to examine the roles that differences in F0 and differences in vocal-tract length play in the ability to attend to one of two simultaneous speech signals. The first experiment asked how increases in the natural F0 difference between two sentences (originally spoken by the same talker) affected listeners' ability to attend to one of the sentences. The second experiment used differences in vocal-tract length, and the third used both F0 and vocal-tract length differences. Differences in F0 greater than 2 semitones produced systematic improvements in performance. Differences in vocal-tract length produced systematic improvements in performance when the ratio of lengths was 1.08 or greater, particularly when the shorter vocal tract belonged to the target talker. Neither of these manipulations produced improvements in performance as great as those produced by a different-sex talker. Systematic changes in both F0 and vocal-tract length that simulated an incremental shift in gender produced substantially larger improvements in performance than did differences in F0 or vocal-tract length alone. In general, shifting one of two utterances spoken by a female voice towards a male voice produced a greater improvement in performance than shifting a male voice towards a female voice. The increase in performance varied with the intonation patterns of individual talkers, being smallest for those talkers who showed the most variability in their intonation patterns between different utterances.
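To give a sense of scale for the two thresholds reported above, the following is a minimal sketch of the arithmetic, not the resynthesis procedure used in the experiments. It assumes the standard equal-tempered definition of a semitone (a frequency ratio of 2^(1/12)) and, for vocal-tract length, a uniform-tube approximation in which formant frequencies scale inversely with tract length. The function names and the example formant values are hypothetical, chosen only for illustration.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for an interval in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(ratio: float) -> float:
    """Interval in semitones corresponding to a frequency ratio."""
    return 12.0 * math.log2(ratio)


def scale_formants(formants_hz, length_ratio):
    """Scale formants for a vocal tract lengthened by length_ratio.

    Assumption: uniform-tube model, so every formant is divided by the
    length ratio (a longer tract gives lower formants).
    """
    return [f / length_ratio for f in formants_hz]


if __name__ == "__main__":
    # A 2-semitone F0 difference expressed as a frequency ratio.
    print(f"2 semitones -> ratio {semitones_to_ratio(2):.4f}")

    # A vocal-tract length ratio of 1.08 expressed on the semitone scale.
    print(f"ratio 1.08 -> {ratio_to_semitones(1.08):.2f} semitones")

    # Hypothetical formant values (Hz), used only to illustrate the scaling.
    example_formants = [500.0, 1500.0, 2500.0]
    scaled = scale_formants(example_formants, 1.08)
    print([round(f, 1) for f in scaled])
```

Under these assumptions, a 2-semitone F0 difference corresponds to a frequency ratio of about 1.12, while a length ratio of 1.08 shifts formants by roughly 1.3 semitones, so the two thresholds are of a broadly similar size when expressed on a common logarithmic scale.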
