Performance of speaker-dependent wideband speech coding

This paper examines the performance gains available in wideband speech coding when speaker-dependent systems are used. In line spectral frequency (LSF) coding, a gain of 4 bits per frame, in the rate-distortion sense, is shown to be achievable. Although variations are evident in the pitch lag statistics during voiced frames, there is no gain to be had in unvoiced frames or in the adaptive gains; thus, there is little benefit to speaker-dependent coding of the adaptive codebook parameters. Finally, gains of 40-50 bits per frame are shown to be available in the fixed excitation. These gains can be exploited in a number of ways, most simply by reducing the operating rate. Alternatively, the complexity of the coding system can be reduced while matching the performance of speaker-independent coding: with speaker-dependent LSF quantization, a reduction in complexity by a factor of 4 is shown to be achievable.
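
As a rough illustration of what the per-frame gains mean for the operating rate, the following minimal sketch converts bits saved per frame into a bit-rate reduction. The 20 ms frame length is an assumption made only for this example (the abstract does not state the frame length), and the function name `bits_per_frame_to_bps` is hypothetical.

```python
def bits_per_frame_to_bps(bits_per_frame: float, frame_ms: float = 20.0) -> float:
    """Convert a per-frame bit saving into bits per second.

    frame_ms is the frame length in milliseconds; the 20 ms default is an
    assumed value for illustration, not a figure taken from the paper.
    """
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    frames_per_second = 1000.0 / frame_ms
    return bits_per_frame * frames_per_second


# Gains quoted in the abstract, in bits per frame.
lsf_gain = 4
fixed_excitation_gain = (40, 50)

lsf_bps = bits_per_frame_to_bps(lsf_gain)
fixed_bps = tuple(bits_per_frame_to_bps(g) for g in fixed_excitation_gain)

print(f"LSF coding: {lsf_bps:.0f} bit/s saved")
print(f"Fixed excitation: {fixed_bps[0]:.0f} to {fixed_bps[1]:.0f} bit/s saved")
```

Under the assumed 20 ms frame, the 4 bits per frame in LSF coding correspond to 200 bit/s, and the 40-50 bits per frame in the fixed excitation to 2000-2500 bit/s; a different frame length scales these figures proportionally.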
