Paper Information - Instantaneous Harmonic Analysis: Techniques and Applications to Speech Signal Processing

Parametric speech modeling is a key issue in various processing applications such as text-to-speech synthesis, voice morphing, voice conversion and others. Building an adequate parametric model is a complicated problem given the time-varying nature of speech. This paper gives an overview of tools for instantaneous harmonic analysis and shows how they can be applied to stationary, frequency-modulated and quasiperiodic signals in order to extract and manipulate instantaneous pitch, excitation and spectrum envelope.

Elias Azarov | Alexander A. Petrovsky

[1] Eric Moulines, et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, 1989, Speech Commun.

[2] Julius O. Smith, et al. A Sines+Transients+Noise Audio Representation for Data Compression and Time/Pitch Scale Modifications, 1998.

[3] Elias Azarov, et al. Guslar: A framework for automated singing voice correction, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4] Masaaki Honda, et al. Sinusoidal model based on instantaneous frequency attractors, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[5] J. F. Kaiser, et al. On a simple algorithm to calculate the 'energy' of a signal, 1990, International Conference on Acoustics, Speech, and Signal Processing.

[6] Elias Azarov, et al. Instantaneous harmonic representation of speech using multicomponent sinusoidal excitation, 2013, INTERSPEECH.

[7] Petros Maragos, et al. Energy separation in signal modulations with application to speech analysis, 1993, IEEE Trans. Signal Process.

[8] Hideki Kawahara, et al. Development of exploratory research tools based on TANDEM-STRAIGHT, 2009.

[9] Elias Azarov, et al. Instantaneous pitch estimation based on RAPT framework, 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).

[10] W. Bastiaan Kleijn, et al. A Canonical Representation of Speech, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[11] Eric Moulines, et al. HNS: Speech modification based on a harmonic+noise model, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[12] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.

[13] Thomas F. Quatieri, et al. Speech analysis/Synthesis based on a sinusoidal representation, 1986, IEEE Trans. Acoust. Speech Signal Process.

[14] J. L. Flanagan, et al. Phase vocoder, 2008.

[15] Elias Azarov, et al. Linear prediction of deterministic components in hybrid signal representation, 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.