Frequency-Specific Temporal Envelope and Periodicity Components for Lexical Tone Identification in Cantonese

Objectives: Temporal envelope and periodicity components (TEPC) in the speech signal have the potential to offer important cues for speech recognition, especially in tonal languages. The aims of this study are: (i) to investigate how much TEPC contribute to lexical tone identification in Cantonese; and (ii) to investigate whether these contributions vary across frequency bands. The results would reveal whether any frequency-specific TEPC are important for lexical tone identification. Design: TEPC of monosyllabic words carrying different lexical tones were extracted by full-wave rectification and low-pass filtering. They were then used to modulate a speech-spectrum noise to create the test stimuli. The stimuli thus contained only the temporal envelope and periodicity components, and none of the temporal fine structure, of the original speech signal. Multiple sets of stimuli were created with different combinations of TEPC-modulated frequency bands. Eighteen adult subjects with normal hearing participated in the study. Results: Lexical tone identification was best when only the TEPC from the two high-frequency bands (1–2 kHz and 2–4 kHz) of the original signal were provided, and worst when only the TEPC from the two low-frequency bands (60–500 Hz and 500–1000 Hz) were provided. These findings suggest that the high-frequency bands carry TEPC that are important for lexical tone identification. Lexical tone identification performance was better for the male stimuli than for the female stimuli. Conclusions: The results indicate the potential for improving speech recognition in tonal languages by manipulating TEPC through new signal processing algorithms in hearing prostheses.
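The extraction described under Design (band-limiting, full-wave rectification, low-pass filtering, then modulation of a noise carrier) can be illustrated with a minimal sketch. It is not the authors' implementation. The band edges are taken from the abstract, but everything else is an assumption made for illustration: the second-order band-pass filter, the one-pole low-pass filter, the 500 Hz envelope cutoff, the use of white Gaussian noise in place of speech-spectrum noise, and all function names.

```python
import math
import random

# Band edges in Hz, as listed in the abstract.
BANDS_HZ = [(60, 500), (500, 1000), (1000, 2000), (2000, 4000)]


def bandpass(x, low_hz, high_hz, fs):
    """Second-order band-pass biquad (an assumed filter, not from the study)."""
    f0 = math.sqrt(low_hz * high_hz)          # geometric centre frequency
    q = f0 / (high_hz - low_hz)               # quality factor from bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def lowpass(x, cutoff_hz, fs):
    """One-pole low-pass smoother (an assumed filter, not from the study)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, prev = [], 0.0
    for xn in x:
        prev = (1.0 - a) * xn + a * prev
        y.append(prev)
    return y


def extract_tepc(x, band, fs, env_cutoff_hz=500.0):
    """Band-limit, full-wave rectify, then low-pass filter.

    env_cutoff_hz is a hypothetical value; the abstract does not state it.
    """
    low_hz, high_hz = band
    band_signal = bandpass(x, low_hz, high_hz, fs)
    rectified = [abs(v) for v in band_signal]  # full-wave rectification
    return lowpass(rectified, env_cutoff_hz, fs)


def modulate_noise(envelope, seed=0):
    """Impose the envelope on white Gaussian noise.

    White noise is a stand-in here; the study used speech-spectrum noise.
    """
    rng = random.Random(seed)
    return [e * rng.gauss(0.0, 1.0) for e in envelope]


def make_stimulus(x, bands, fs):
    """Sum noise carriers modulated by the TEPC of the selected bands."""
    out = [0.0] * len(x)
    for i, band in enumerate(bands):
        carrier = modulate_noise(extract_tepc(x, band, fs), seed=i)
        out = [o + c for o, c in zip(out, carrier)]
    return out
```

Under these assumptions, a condition such as "only the two high-frequency bands" would correspond to `make_stimulus(signal, BANDS_HZ[2:], fs)`, and the low-frequency condition to `make_stimulus(signal, BANDS_HZ[:2], fs)`. Because the carrier is noise, the output retains the slow amplitude fluctuations of each band but none of the original fine structure.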
