Voice segregation by difference in fundamental frequency: evidence for harmonic cancellation.

Two experiments investigated listeners' ability to use a two-semitone difference in fundamental frequency (F0) to segregate a target voice from harmonic complex tones with speech-like spectral profiles. Masker partials were in random phase (experiment 1) or in sine phase (experiment 2), and stimuli were presented over headphones. The harmonicity of the target and that of the masker were each distorted by F0 modulation and reverberation. The F0 of each source was manipulated factorially (monotonized or modulated by 2 semitones at 5 Hz). In addition, all sources were presented from the same location in a virtual room with controlled reverberation, which was assigned factorially to each source. In both experiments, speech reception thresholds increased by about 2 dB when the F0 of the masker was modulated and by about 6 dB when, in addition to F0 modulation, the masker was reverberant. Masker partial phases did not influence the results. The results suggest that F0-segregation relies upon the masker's harmonicity, which is disrupted by rapid modulation, and that this effect is compounded by reverberation. In addition, F0-segregation was found to be independent of the depth of masker envelope modulations.
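
As a rough illustration of the stimulus manipulations summarized above, the following Python sketch converts a semitone interval to a frequency ratio and builds a harmonic complex whose F0 is either monotone or sinusoidally modulated, with partials in sine or random phase. It is not the authors' synthesis procedure. The function names, the base F0 of 100 Hz, the sample rate, the number of harmonics, and the flat (rather than speech-like) spectral profile are all assumptions made for the example, as is the reading of "modulated by 2 semitones" as a sinusoidal excursion of plus or minus 2 semitones.

```python
import math
import random


def semitones_to_ratio(semitones):
    """Frequency ratio for an interval in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def f0_contour(base_f0, depth_semitones, rate_hz, duration_s, fs):
    """Sinusoidal F0 contour in Hz (assumed shape).

    A depth of 0 gives a monotonized (constant) contour.
    """
    n = int(duration_s * fs)
    return [
        base_f0 * semitones_to_ratio(
            depth_semitones * math.sin(2.0 * math.pi * rate_hz * i / fs)
        )
        for i in range(n)
    ]


def harmonic_complex(f0_track, fs, n_harmonics, random_phase, seed=0):
    """Sum of equal-amplitude harmonics that follow a time-varying F0.

    Partials start in sine phase, or in random phase if random_phase is True.
    The flat spectrum is a simplification, not a speech-like profile.
    """
    rng = random.Random(seed)
    phases = [
        rng.uniform(0.0, 2.0 * math.pi) if random_phase else 0.0
        for _ in range(n_harmonics)
    ]
    samples = []
    fundamental_phase = 0.0  # running phase of the fundamental, in radians
    for f0 in f0_track:
        samples.append(
            sum(
                math.sin((k + 1) * fundamental_phase + phases[k])
                for k in range(n_harmonics)
            )
        )
        fundamental_phase += 2.0 * math.pi * f0 / fs
    return samples


# Hypothetical parameter values, chosen only for illustration.
FS = 8000          # sample rate in Hz
BASE_F0 = 100.0    # base fundamental frequency in Hz
monotone = f0_contour(BASE_F0, 0.0, 5.0, 0.2, FS)
modulated = f0_contour(BASE_F0, 2.0, 5.0, 0.2, FS)
masker_sine = harmonic_complex(modulated, FS, 10, random_phase=False)
masker_random = harmonic_complex(modulated, FS, 10, random_phase=True)
```

A two-semitone interval corresponds to a frequency ratio of 2^(2/12), roughly 1.122, so under these assumptions the modulated contour swings between about 89 Hz and 112 Hz around the 100 Hz base. Accumulating the phase of the fundamental sample by sample keeps every partial at an exact integer multiple of the instantaneous F0.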