Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria

This paper presents a new method for selecting sinusoidal components for use in compact representations of narrowband audio. The method ranks the candidate sinusoids by perceptual relevance and retains only the most relevant ones. The selection criterion is to maximize the match between the auditory excitation pattern of the original signal and the auditory excitation pattern of the modeled signal, which is represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.
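The selection idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a toy excitation model (a triangular spreading function on the Bark scale with illustrative slopes of 25 dB/Bark below and 10 dB/Bark above each component), a squared dB distance between patterns, and a greedy search. The names `excitation_pattern`, `pattern_distance`, and `select_sinusoids` are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def excitation_pattern(components, grid):
    """Toy excitation pattern (linear power) sampled on a Bark grid.

    Each sinusoid (freq_hz, amplitude) spreads its power with a triangular
    function in dB. The slopes are illustrative assumptions, not values
    taken from the paper.
    """
    pattern = [1e-12] * len(grid)  # small floor avoids log10(0)
    for freq_hz, amp in components:
        z0 = hz_to_bark(freq_hz)
        power = amp * amp / 2.0
        for i, z in enumerate(grid):
            dz = z - z0
            atten_db = 10.0 * dz if dz >= 0 else -25.0 * dz
            pattern[i] += power * 10.0 ** (-atten_db / 10.0)
    return pattern


def pattern_distance(p_ref, p_model):
    """Sum of squared dB differences between two excitation patterns."""
    return sum(
        (10.0 * math.log10(a) - 10.0 * math.log10(b)) ** 2
        for a, b in zip(p_ref, p_model)
    )


def select_sinusoids(candidates, k, grid=None):
    """Greedily pick k sinusoids whose pattern best matches the full set's."""
    if grid is None:
        grid = [0.25 * i for i in range(97)]  # 0 to 24 Bark
    target = excitation_pattern(candidates, grid)
    selected = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        # Choose the candidate that most reduces the pattern mismatch.
        best = min(
            remaining,
            key=lambda c: pattern_distance(
                target, excitation_pattern(selected + [c], grid)
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Hypothetical candidates: two strong partials, each with a weak neighbour.
    peaks = [(440.0, 1.0), (450.0, 0.01), (3000.0, 0.5), (3100.0, 0.005)]
    print(select_sinusoids(peaks, 2))
```

In this sketch the excitation pattern of the full candidate set stands in for that of the original signal. A closer implementation would derive the reference pattern from the signal spectrum itself and use a calibrated auditory model.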
