Glottal inverse filtering using stabilised weighted linear prediction

This paper presents and evaluates an inverse filtering technique of the speech signal which is based on the Stabilized Weighted Linear Prediction (SWLP) of speech [1]. SWLP emphasizes the speech samples that fit the underlying speech production model well, by imposing temporal weighting of the square of the residual signal. The performance of SWLP is compared to the conventional Linear Prediction based inverse filtering techniques, such as the Autocorrelation and Closed Phase Covariance method. All the inverse filtering approaches are evaluated on a database of speech signals generated by a physical model of the voice production system. Results show that the estimated glottal flows using SWLP are closer to the original glottal flow than those estimated by the Autocorrelation approach, while its performance is comparable to the Closed Phase Covariance approach.

[1]  Thomas Quatieri,et al.  Discrete-Time Speech Signal Processing: Principles and Practice , 2001 .

[2]  Yves Kamp,et al.  Robust signal selection for linear prediction analysis of voiced speech , 1993, Speech Commun..

[3]  R. Miller Nature of the Vocal Cord Wave , 1956 .

[4]  Douglas A. Reynolds,et al.  Modeling of the glottal flow derivative waveform with application to speaker identification , 1999, IEEE Trans. Speech Audio Process..

[5]  Juan Carlos,et al.  Review of "Discrete-Time Speech Signal Processing - Principles and Practice", by Thomas Quatieri, Prentice-Hall, 2001 , 2003 .

[6]  D. Veeneman,et al.  Automatic glottal inverse filtering from speech and electroglottographic signals , 1985, IEEE Trans. Acoust. Speech Signal Process..

[7]  Paavo Alku,et al.  Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering , 1991, Speech Commun..

[8]  Paavo Alku,et al.  Stabilised weighted linear prediction , 2009, Speech Commun..

[9]  I R Titze,et al.  Comparison between electroglottography and electromagnetic glottography. , 2000, The Journal of the Acoustical Society of America.

[10]  I. Titze,et al.  Rules for controlling low-dimensional vocal fold models with muscle activation. , 2002, The Journal of the Acoustical Society of America.

[11]  I R Titze,et al.  Vocal intensity in speakers and singers. , 1991, The Journal of the Acoustical Society of America.

[12]  Paul H. Milenkovic,et al.  Glottal inverse filtering by joint estimation of an AR system with a linear input model , 1986, IEEE Trans. Acoust. Speech Signal Process..