Adaptive time-segmentation for speech coding with limited delay

We investigate the trade-off between delay and signal quality in adaptive time-segmentation for speech coding. A variable-rate sinusoidal coder with adaptive segmentation and bit allocation is proposed and implemented with a specifiable look-ahead. Objective and subjective results indicate that adaptive time-segmentation is advantageous even at low delay (30 ms), and that quality improves with increasing delay only up to approximately 100 ms.
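
To illustrate how look-ahead limits an adaptive segmentation, the following is a minimal sketch and not the coder described above. It assumes a toy per-segment cost (squared error of a constant fit plus a fixed penalty standing in for rate) and a small set of candidate segment lengths. All names (`segment_cost`, `best_segmentation`, `segment_with_lookahead`) are hypothetical. Within each look-ahead window, dynamic programming finds the cheapest segmentation, but only the first segment is committed before the window advances.

```python
def segment_cost(x, start, end, penalty=1.0):
    """Toy cost (an assumption): squared error of a constant fit + fixed penalty."""
    seg = x[start:end]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg) + penalty


def best_segmentation(x, start, stop, lengths, cost):
    """Dynamic programming over x[start:stop]; returns segment lengths of minimum total cost."""
    n = stop - start
    inf = float("inf")
    best = [inf] * (n + 1)   # best[e]: minimum cost of tiling the first e samples
    back = [0] * (n + 1)     # back[e]: length of the last segment in that tiling
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in lengths:
            if length <= end and best[end - length] < inf:
                c = best[end - length] + cost(x, start + end - length, start + end)
                if c < best[end]:
                    best[end] = c
                    back[end] = length
    # If the window cannot be tiled exactly, fall back to the longest tileable prefix.
    end = n
    while end > 0 and best[end] == inf:
        end -= 1
    segs = []
    while end > 0:
        segs.append(back[end])
        end -= back[end]
    segs.reverse()
    return segs


def segment_with_lookahead(x, lengths, lookahead, cost=segment_cost):
    """Commit one segment at a time, seeing at most `lookahead` samples ahead."""
    if lookahead < min(lengths):
        raise ValueError("lookahead must cover at least the shortest segment")
    pos, out = 0, []
    while pos < len(x):
        stop = min(len(x), pos + lookahead)
        segs = best_segmentation(x, pos, stop, lengths, cost)
        if not segs:
            # Tail shorter than the shortest candidate: emit it as a single segment.
            out.append(len(x) - pos)
            break
        out.append(segs[0])
        pos += segs[0]
    return out


# A signal with one abrupt change, segmented with long and short look-ahead.
x = [0] * 8 + [10] * 8
print(segment_with_lookahead(x, (4, 8, 16), lookahead=16))  # [8, 8]
print(segment_with_lookahead(x, (4, 8, 16), lookahead=4))   # [4, 4, 4, 4]
```

In this toy setting, the short look-ahead can only ever choose the shortest segment and therefore pays the per-segment penalty more often, while the longer look-ahead recovers the two natural segments. A real coder would replace the cost with a rate-distortion measure.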
