A psychoacoustic method for studying the necessary and sufficient perceptual cues of American English fricative consonants in noise.

In a previous study on plosives, the 3-Dimensional Deep Search (3DDS) method for exploring the necessary and sufficient cues for speech perception was introduced [Li et al. (2010). J. Acoust. Soc. Am. 127(4), 2599-2610]. Here, this method is used to isolate the spectral cue regions for perception of the American English fricatives /ʃ, ʒ, s, z, f, v, θ, ð/ in time, frequency, and intensity. The fricatives are analyzed in the context of consonant-vowel utterances, using the vowel /ɑ/. The necessary cues were found to be contained in the frication noise for /ʃ, ʒ, s, z, f, v/. 3DDS analysis isolated the cue regions of /s, z/ between 3.6 and 8 kHz and of /ʃ, ʒ/ between 1.4 and 4.2 kHz. Some utterances were found to contain acoustic components that were unnecessary for correct perception but caused listeners to hear non-target consonants when the primary cue region was removed; such acoustic components are labeled "conflicting cue regions." The amplitude modulation of the high-frequency frication region by the fundamental frequency F0 was found to be a sufficient cue for voicing. Overall, the 3DDS method allows one to analyze the effects of natural speech components without initial assumptions about where perceptual cues lie in time-frequency space or which elements of production they correspond to.
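To make the notion of removing a cue region concrete, the following is a minimal sketch, not the 3DDS procedure itself. It zeroes the FFT bins of a signal inside the 3.6 to 8 kHz band reported above for /s, z/. The sampling rate, the white-noise stand-in for frication, and the helper names `remove_band` and `band_energy` are all assumptions made for illustration.

```python
import numpy as np

# Cue regions reported in the abstract, in Hz (the dictionary keys are illustrative labels).
CUE_REGIONS_HZ = {
    "s_z": (3600.0, 8000.0),
    "sh_zh": (1400.0, 4200.0),
}


def band_mask(n_samples, fs, band_hz):
    """Boolean mask over rfft bins whose center frequency lies inside band_hz (inclusive)."""
    lo, hi = band_hz
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    return (freqs >= lo) & (freqs <= hi)


def remove_band(signal, fs, band_hz):
    """Return a copy of signal with all spectral content inside band_hz set to zero."""
    spectrum = np.fft.rfft(signal)
    spectrum[band_mask(len(signal), fs, band_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


def band_energy(signal, fs, band_hz):
    """Sum of squared rfft magnitudes inside band_hz."""
    spectrum = np.fft.rfft(signal)
    mask = band_mask(len(signal), fs, band_hz)
    return float(np.sum(np.abs(spectrum[mask]) ** 2))


# Hypothetical setup: 200 ms of white noise at 20 kHz as a crude stand-in for frication.
fs = 20000
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs // 5)

# Remove the /s, z/ cue band; content outside the band is left untouched.
filtered = remove_band(noise, fs, CUE_REGIONS_HZ["s_z"])
```

An ideal brick-wall mask like this is chosen only for clarity; it says nothing about the filters or listening conditions used in the study.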

[1]  Jont B. Allen, et al.  The influence of stop consonants' perceptual features on the Articulation Index model, 2012, The Journal of the Acoustical Society of America.

[2]  Jont B. Allen, et al.  Perceptual effects of plosive feature modification, 2012, The Journal of the Acoustical Society of America.

[3]  Torsten Dau, et al.  Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing, 2011, The Journal of the Acoustical Society of America.

[4]  Mark Hasegawa-Johnson, et al.  Intelligibility predictors and neural representation of speech, 2011, Speech Commun.

[5]  Jont B. Allen, et al.  Manipulation of Consonants in Natural Speech, 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[6]  J. C. Steinberg, et al.  Factors Governing the Intelligibility of Speech Sounds, 1945.

[7]  Jont B. Allen, et al.  A psychoacoustic method to find the perceptual cues of stop consonants in natural speech, 2010, The Journal of the Acoustical Society of America.

[8]  Jont B. Allen, et al.  Multiband product rule and consonant identification, 2009, The Journal of the Acoustical Society of America.

[9]  Bryce E. Lobdell.  Models of Human Phone Transcription in Noise Based on Intelligibility Predictors, 2009.

[10]  Jont B. Allen, et al.  Consonant confusions in white noise, 2008, The Journal of the Acoustical Society of America.

[11]  Marion S. Régnier, et al.  A method to identify noise-robust perceptual features: application for consonant /t/, 2008, The Journal of the Acoustical Society of America.

[12]  Nima Mesgarani, et al.  Phoneme representation and classification in primary auditory cortex, 2008, The Journal of the Acoustical Society of America.

[13]  A. Jongman, et al.  Perception of clear fricatives by normal-hearing and simulated hearing-impaired listeners, 2008, The Journal of the Acoustical Society of America.

[14]  Jont B. Allen.  Consonant recognition and the articulation index, 2005, The Journal of the Acoustical Society of America.

[15]  A. Jongman, et al.  Acoustic characteristics of English fricatives, 2000, The Journal of the Acoustical Society of America.

[16]  Christine H. Shadle, et al.  Quantifying spectral characteristics of fricatives, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[17]  R. V. Shannon, et al.  Speech Recognition with Primarily Temporal Cues, 1995, Science.

[18]  Jont B. Allen, et al.  How do humans process and recognize speech?, 1993, IEEE Trans. Speech Audio Process.

[19]  R. N. Ohde, et al.  Effect of relative amplitude of frication on perception of place of articulation, 1991, The Journal of the Acoustical Society of America.

[20]  S. Blumstein, et al.  Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters, 1992, The Journal of the Acoustical Society of America.

[21]  D. Whalen, et al.  Perception of the English /s/–/ʃ/ distinction relies on fricative noises and transitions, not on brief spectral slices, 1991.

[22]  A. Jongman.  Duration of frication noise required for identification of English fricatives, 1989, The Journal of the Acoustical Society of America.

[23]  S. Blumstein, et al.  On the role of the amplitude of the fricative noise in the perception of place of articulation in voiceless fricative consonants, 1988, The Journal of the Acoustical Society of America.

[24]  S. R. Baum, et al.  Preliminary observations on the use of duration as a cue to syllable-initial fricative consonant voicing in English, 1987, The Journal of the Acoustical Society of America.

[25]  S. Furui.  On the role of spectral transition for speech perception, 1986, The Journal of the Acoustical Society of America.

[26]  S. Soli.  Second formants in fricatives: Acoustic consequences of fricative-vowel coarticulation, 1981.

[27]  D. Pisoni, et al.  Speech perception without traditional speech cues, 1981, Science.

[28]  D. Whalen.  Effects of vocalic formant transitions and vowel quality on the English [s]-[š] boundary, 1981, The Journal of the Acoustical Society of America.

[29]  S. Blumstein, et al.  Invariant cues for place of articulation in stop consonants, 1978, The Journal of the Acoustical Society of America.

[30]  S. Blumstein, et al.  Property detectors for bursts and transitions in speech perception, 1975, The Journal of the Acoustical Society of America.

[31]  M. D. Wang, et al.  Consonant confusions in noise: a study of perceptual features, 1973, The Journal of the Acoustical Society of America.

[32]  K. Stevens, et al.  Reduction of Speech Spectra by Analysis-by-Synthesis Techniques, 1961.

[33]  K. Stevens, et al.  On the Properties of Voiceless Fricative Consonants, 1961.

[34]  K. Harris.  Cues for the Discrimination of American English Fricatives in Spoken Syllables, 1958.

[35]  A. Liberman, et al.  Some Cues for the Distinction Between Voiced and Voiceless Stops in Initial Position, 1957.

[36]  G. W. Hughes, et al.  Spectral Properties of Fricative Consonants, 1956.

[37]  G. A. Miller, et al.  An Analysis of Perceptual Confusions Among Some English Consonants, 1955.

[38]  A. Liberman, et al.  Acoustic Loci and Transitional Cues for Consonants, 1954.

[39]  A. Liberman, et al.  Some Experiments on the Perception of Synthetic Speech Sounds, 1952.

[40]  Harvey Fletcher, et al.  Errata: The Perception of Speech and Its Relation to Telephony [J. Acous. Soc. Am. 22, 89 (1950)], 1950.

[41]  H. Fletcher, et al.  The Perception of Speech and Its Relation to Telephony, 1950.