Phase equalization-based autoregressive model of speech signals

This paper presents a novel method for estimating the vocal-tract spectrum from speech signals, based on a model of the excitation signal of voiced speech. A formulation of linear predictive coding (LPC) with an impulse-train excitation is derived and applied to phase-equalized speech signals, which are obtained from the original speech signals by phase equalization. Preliminary results show that, compared with the conventional method, the proposed method improves both the robustness of vocal-tract spectrum estimation and the quality of re-synthesized speech. The technique is expected to be useful for speech coding, speech synthesis, and real-time speech conversion.

Index Terms: LPC, vocal-tract spectrum, phase equalization
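The idea of fitting an autoregressive model whose excitation is an impulse train can be illustrated with a small least-squares sketch. This is not the paper's formulation, which the abstract does not spell out; it is one plausible reading under stated assumptions: the pulse positions are already known, a single gain g is shared by all pulses, and the phase-equalization step that would precede the fit is not shown. The names ar_fit_with_impulse_train and synthesize are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def ar_fit_with_impulse_train(s, pulse_positions, order):
    """Least-squares fit of s[n] = sum_k a[k] * s[n-k] + g * e[n].

    e[n] is a unit impulse train at the given (assumed known) positions.
    Hypothetical helper; not the paper's actual formulation.
    """
    s = np.asarray(s, dtype=float)
    e = np.zeros(len(s))
    e[np.asarray(pulse_positions, dtype=int)] = 1.0
    n = np.arange(order, len(s))
    # Regressors: the `order` past samples, plus the impulse-train excitation.
    past = np.column_stack([s[n - k] for k in range(1, order + 1)])
    X = np.column_stack([past, e[n]])
    theta, *_ = np.linalg.lstsq(X, s[n], rcond=None)
    return theta[:order], theta[order]

def synthesize(a, g, pulse_positions, length):
    """Drive an all-pole filter with an impulse train (illustrative test signal)."""
    s = np.zeros(length)
    pulses = {int(p) for p in pulse_positions}
    for n in range(length):
        acc = g if n in pulses else 0.0
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * s[n - k]
        s[n] = acc
    return s

# Synthetic check: a stable 2nd-order resonator excited every 40 samples.
true_a = np.array([1.3, -0.8])
pulses = np.arange(5, 400, 40)
s = synthesize(true_a, 1.0, pulses, 400)
a_hat, g_hat = ar_fit_with_impulse_train(s, pulses, order=2)
```

On this noise-free synthetic signal the fit recovers the generating coefficients, because the impulse train accounts for the excitation instead of leaving it in the prediction residual. Real speech would need pitch-mark estimation and the phase-equalization step, neither of which is sketched here.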
