Combined speech and audio coding by discrimination

This paper proposes a general solution for combined speech and audio coding. In particular, we describe a speech/music discrimination procedure for multi-mode wideband coding. The speech/music decision is updated only when a low-energy frame is detected and is held unchanged otherwise. The signal is classified using second-order statistics of discriminant parameters. An experimental CELP/transform coder operating at 16 kbit/s is demonstrated. Results show improved performance over single-mode encoding.
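The decision-update rule described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the discriminant parameters, the statistics used, or any thresholds, so the single scalar parameter per frame, the use of variance as the second-order statistic, the rule "high variance means speech", and the names `classify_segment`, `track_decisions`, `energy_threshold`, `var_threshold`, and `min_frames` are all assumptions made for illustration.

```python
from statistics import pvariance


def classify_segment(params, var_threshold):
    """Label a segment from the variance of one discriminant parameter.

    Assumption: a highly variable parameter indicates speech. The paper's
    actual parameters and decision rule are not given in the abstract.
    """
    return "speech" if pvariance(params) > var_threshold else "music"


def track_decisions(energies, params, energy_threshold, var_threshold,
                    initial="speech", min_frames=2):
    """Return one speech/music label per frame.

    The label is re-evaluated only at low-energy frames and held unchanged
    at every other frame.
    """
    decision = initial
    history = []      # parameter values collected since the last update
    decisions = []
    for energy, value in zip(energies, params):
        history.append(value)
        # Update only when a low-energy frame is detected.
        if energy < energy_threshold and len(history) >= min_frames:
            decision = classify_segment(history, var_threshold)
            history = []  # start collecting statistics for the next segment
        decisions.append(decision)
    return decisions


if __name__ == "__main__":
    energies = [1.0, 1.0, 1.0, 0.01, 1.0, 1.0, 0.01]
    params = [0.5, 0.5, 0.5, 0.5, 0.0, 1.0, 0.0]
    print(track_decisions(energies, params,
                          energy_threshold=0.1, var_threshold=0.05))
```

Restricting updates to low-energy frames means a mode switch between the two coders can only occur where the signal is quiet, which is presumably where a switching artifact would be least audible; that reading is an interpretation of the abstract, not a claim it makes.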