High Resolution Spherical Quantization of Sinusoids with Harmonically Related Frequencies

Sinusoidal coding is an essential tool in low-rate audio coding, so an efficient quantization scheme for the sinusoidal parameters is crucial. In this work we derive optimal entropy-constrained amplitude, phase and frequency quantizers, with respect to the l2 distortion measure, for sinusoids whose frequencies are harmonically related. The scheme exploits the harmonic structure of many speech and audio signals: besides the amplitudes and phases, only the fundamental frequencies need to be quantized, which significantly reduces the number of bits assigned to frequency parameters. The asymptotically optimal quantizers minimize a high-resolution approximation of the expected l2 distortion, subject to an entropy constraint on the corresponding quantization indices. The quantizers are flexible and of low complexity, in the sense that they can be determined easily for varying bit rate requirements without retraining or iterative procedures. In an objective rate-distortion comparison, the proposed scheme is shown to outperform two variants of a recently proposed scheme in which all frequency parameters are quantized separately, either directly or differentially.
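The idea of quantizing only the fundamental frequency can be illustrated with a minimal Python sketch. It is not the derived entropy-constrained quantizer: it assumes a plain uniform scalar quantizer with an arbitrary step size, and every function name and parameter value here is hypothetical. It shows that one quantized fundamental determines all harmonic frequencies, and that the frequency error of the k-th harmonic is k times the error of the fundamental.

```python
import math


def quantize_uniform(value, step):
    """Uniform scalar quantizer: round value to the nearest multiple of step.

    Assumption: a fixed step stands in for the paper's optimal quantizers.
    """
    return step * round(value / step)


def harmonic_frequencies(f0, num_harmonics):
    """Frequencies k * f0 for k = 1..num_harmonics."""
    return [k * f0 for k in range(1, num_harmonics + 1)]


def synthesize(amplitudes, phases, freqs, num_samples, sample_rate):
    """Sum of sinusoids a * cos(2*pi*f*n/fs + phi) over num_samples samples."""
    return [
        sum(
            a * math.cos(2.0 * math.pi * f * n / sample_rate + p)
            for a, p, f in zip(amplitudes, phases, freqs)
        )
        for n in range(num_samples)
    ]


def l2_distortion(x, y):
    """Mean squared error between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    f0, num_harmonics, sample_rate = 220.3, 4, 8000
    amplitudes = [1.0, 0.5, 0.25, 0.125]
    phases = [0.0, 0.3, -0.7, 1.1]

    f0_hat = quantize_uniform(f0, 0.5)  # only one frequency parameter is coded
    original = synthesize(amplitudes, phases,
                          harmonic_frequencies(f0, num_harmonics),
                          256, sample_rate)
    decoded = synthesize(amplitudes, phases,
                         harmonic_frequencies(f0_hat, num_harmonics),
                         256, sample_rate)
    print("quantized f0:", f0_hat)
    print("l2 distortion:", l2_distortion(original, decoded))
```

Because the error of the fundamental is multiplied by the harmonic index, the resolution chosen for the fundamental has to account for the highest harmonic that is synthesized from it.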
