High-Quality Time Stretch and Pitch Shift Effects for Speech and Audio Using the Instantaneous Harmonic Analysis

This paper presents methods for instantaneous harmonic analysis and applies them to high-quality pitch, timbre, and time-scale modification. The analysis technique is based on narrow-band filtering with special analysis filters whose impulse response is frequency-modulated. Its main advantages are accurate estimation of the harmonic parameters and adequate separation of the harmonic and noise components, which make it possible to implement audio and speech effects with a low level of audible artifacts. Time stretch and pitch shift effects are considered as the primary applications.
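The idea of a narrow-band filter with a frequency-modulated impulse response can be illustrated with a minimal sketch. It is not the paper's filter design, which the abstract does not specify. The sketch assumes the frequency track of the component is already known, demodulates the signal along that track, and smooths the result with a Hann window. The function names `track_phase` and `harmonic_at`, the window choice, and all parameter values are hypothetical.

```python
import cmath
import math


def track_phase(freqs_hz, fs):
    """Cumulative phase (radians) of a frequency track sampled at rate fs."""
    phase, out = 0.0, []
    for f in freqs_hz:
        out.append(phase)
        phase += 2.0 * math.pi * f / fs
    return out


def harmonic_at(signal, phase, n, half_len):
    """Estimate amplitude and phase of one component at sample n.

    The signal is demodulated along the phase track and averaged with a
    Hann window. Together these act as a narrow-band filter whose impulse
    response is frequency-modulated by the track (an illustrative
    assumption, not the filter used in the paper).
    """
    acc, norm = 0j, 0.0
    for k in range(-half_len, half_len + 1):
        i = n + k
        if 0 <= i < len(signal):
            w = 0.5 + 0.5 * math.cos(math.pi * k / (half_len + 1))
            acc += w * signal[i] * cmath.exp(-1j * (phase[i] - phase[n]))
            norm += w
    c = 2.0 * acc / norm  # factor 2 restores the amplitude of a real cosine
    return abs(c), cmath.phase(c)


# Usage: a chirp whose frequency rises from 200 Hz to 300 Hz.
fs = 8000
freqs = [200.0 + 100.0 * i / 4000 for i in range(4000)]
phase = track_phase(freqs, fs)
signal = [0.7 * math.cos(p + 0.3) for p in phase]
amp, ph = harmonic_at(signal, phase, 2000, 200)
print(amp, ph)
```

Because the filter follows the frequency track, the window can stay long (and the passband narrow) even while the component's frequency changes, which is what a fixed-frequency band-pass filter cannot do.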
