Band-specific temporal periodicity enhancement for Cantonese tone perception with noise-excited vocoder

This paper studies the effectiveness of expanding the temporal envelope and periodicity component (TEPC) for Cantonese tone perception. The ultimate goal is to develop speech processing techniques that improve speech perception for users of hearing prostheses. Cantonese is a widely spoken Chinese dialect with a complex lexical tone system. TEPCs are extracted from a few predefined frequency bands, and a nonlinear expansion method is applied to increase their modulation depth so that the temporal periodicity information becomes more salient. To simulate the speech processing of a cochlear implant, the TEPC is used to modulate a noise carrier. Psychophysical listening tests on Cantonese lexical tone identification are carried out with expanded and unexpanded TEPCs. The experimental results show that: (1) expanding the TEPC from the high-frequency band leads to a noticeable improvement in tone identification accuracy; (2) TEPC expansion is more effective for the female voice than for the male voice; (3) the presence of a noise carrier in the low-frequency band negatively affects tone identification accuracy.
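The processing chain summarized above (TEPC extraction, nonlinear expansion, noise-carrier modulation) can be illustrated for a single band with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the function names, the full-wave rectifier followed by a first-order low-pass filter, the 500 Hz cutoff, the power-law form of the expansion, and the synthetic amplitude-modulated test tone. The noise carrier is also left broadband, whereas a real vocoder would limit it to the analysis band.

```python
import math
import random


def one_pole_lowpass(x, cutoff_hz, fs):
    """First-order IIR low-pass; a crude stand-in for a proper smoothing filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for v in x:
        state += alpha * (v - state)
        y.append(state)
    return y


def extract_tepc(band_signal, fs, cutoff_hz=500.0):
    """Full-wave rectify, then low-pass (assumed cutoff) to keep envelope and periodicity."""
    return one_pole_lowpass([abs(v) for v in band_signal], cutoff_hz, fs)


def modulation_depth(env):
    """(max - min) / (max + min) of a non-negative envelope."""
    hi, lo = max(env), min(env)
    return (hi - lo) / (hi + lo)


def expand_tepc(env, power=2.0):
    """Hypothetical power-law expansion that leaves the peak level unchanged."""
    peak = max(env)
    if peak == 0.0:
        return list(env)
    return [peak * (v / peak) ** power for v in env]


def noise_vocode_band(env, seed=0):
    """Modulate a white-noise carrier with the envelope (carrier not band-limited here)."""
    rng = random.Random(seed)
    return [e * rng.uniform(-1.0, 1.0) for e in env]


# Synthetic band signal: a 2 kHz tone amplitude-modulated at a 200 Hz "F0".
fs, f0, fc = 16000, 200.0, 2000.0
band = [
    (1.0 + 0.5 * math.cos(2.0 * math.pi * f0 * n / fs))
    * math.sin(2.0 * math.pi * fc * n / fs)
    for n in range(fs // 4)
]

tepc = extract_tepc(band, fs)[fs // 20:]  # drop the filter start-up transient
expanded = expand_tepc(tepc, power=2.0)
vocoded = noise_vocode_band(expanded)

print("depth before:", modulation_depth(tepc))
print("depth after: ", modulation_depth(expanded))
```

With an exponent greater than 1, the troughs of the normalized envelope shrink faster than the peaks, so the modulation depth increases while the peak level stays fixed. This is only one way to realize a "nonlinear expansion"; the method actually used in the study may differ.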
