Wideband speech coding with toll quality based on IA-model

We propose an instantaneous amplitude (IA) based model for representing speech signals. The model avoids the difficulty of handling time-varying phases and makes it straightforward to carry out an optimization procedure that brings the synthetic signal as close as possible to the original. A simplified frequency-picking algorithm is derived to shorten the processing time while maintaining the quality of the synthetic speech. Experiments show that speech synthesized with this technique is of toll quality and almost perceptually indistinguishable from the original. In initial work on coding the IA model parameters for speech sampled at 16 kHz, toll-quality synthesized speech is achieved at a bit rate of 40 kbps.
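The abstract does not spell out the model, so the following is only a minimal sketch of why an amplitude-based representation can make the optimization easy. It assumes a quadrature form in which each component is written as a·cos(2πft) + b·sin(2πft) rather than A·cos(2πft + φ), and it further assumes that the amplitudes are constant over one short frame and that the frequencies are already known. Under those assumptions the signal is linear in the unknown amplitudes, so an ordinary least-squares fit recovers them without any explicit phase estimation. The function name, frame length, and test frequencies are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_quadrature_amplitudes(frame, freqs_hz, fs):
    """Least-squares fit of cosine/sine amplitudes at fixed frequencies.

    Assumption (not from the paper): amplitudes are constant over the frame.
    Returns (a, b, synthetic) where
    synthetic[n] = sum_k a[k]*cos(2*pi*f_k*n/fs) + b[k]*sin(2*pi*f_k*n/fs).
    """
    frame = np.asarray(frame, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    t = np.arange(len(frame)) / fs            # sample times in seconds
    arg = 2.0 * np.pi * np.outer(t, freqs_hz)  # shape (N, K)
    basis = np.concatenate([np.cos(arg), np.sin(arg)], axis=1)  # (N, 2K)
    # The model is linear in the amplitudes, so this is a plain linear solve.
    coeffs, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    k = len(freqs_hz)
    return coeffs[:k], coeffs[k:], basis @ coeffs


# Hypothetical usage: a 20 ms frame at 16 kHz built from three components.
fs = 16000
freqs = [200.0, 400.0, 1000.0]
t = np.arange(320) / fs
frame = sum(a * np.cos(2 * np.pi * f * t) + b * np.sin(2 * np.pi * f * t)
            for f, a, b in zip(freqs, [0.5, -0.3, 0.2], [0.1, 0.4, -0.25]))
a_hat, b_hat, synthetic = fit_quadrature_amplitudes(frame, freqs, fs)
```

If a magnitude and phase are needed afterwards, they follow from each (a, b) pair as A = sqrt(a² + b²) and φ = atan2(−b, a); the fit itself never has to search over φ.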
