Accurate estimation of the glottal flow derivative using iteratively reweighted 1-norm minimization

This paper addresses the accurate estimation of the shape of the glottal flow derivative (GFD) by reweighted 1-norm minimization of its second derivative. For physiological models of the GFD, such as the Liljencrants-Fant (LF) and Rosenberg models, the second derivative is observed to be highly sparse. Based on this observation, an iteratively reweighted 1-norm minimization algorithm is proposed that estimates the vocal tract filter of the speech signal by exploiting the sparsity of the second derivative of the GFD (the residual of the linear prediction model). An experimental study on a data set of 40 vowels, 20 each of /a/ and /e/, shows the efficiency of the proposed algorithm in terms of the number of iterations and the reduction in total run time. Furthermore, GFD estimates for the two vowels /a/ and /e/, obtained with Joint Source-Filter Model Optimization and with the proposed method, demonstrate the accuracy of the proposed algorithm in terms of similarity to the physiological model and precise synthesis.
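As an illustration of the general idea only, the sketch below fits a linear predictor by iteratively reweighted 1-norm minimization of its residual. It is not the algorithm of the paper: the abstract does not give the exact formulation, so the linear-program form, the weight rule w = 1 / (|r| + eps), the default values of `n_iter` and `eps`, and the names `weighted_l1_predictor`, `reweighted_l1_lp` and `synth_sparse_ar` are all assumptions made for this example. The test signal is a hypothetical second-order autoregressive filter driven by a sparse impulse train, standing in for a sparse excitation.

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1_predictor(x, X, w):
    """Solve min_a sum_n w[n] * |x[n] - (X @ a)[n]| as a linear program."""
    n_rows, order = X.shape
    # Variables are [a (free), t (non-negative)], constrained by |x - X a| <= t.
    c = np.concatenate([np.zeros(order), w])
    A_ub = np.block([[-X, -np.eye(n_rows)], [X, -np.eye(n_rows)]])
    b_ub = np.concatenate([-x, x])
    bounds = [(None, None)] * order + [(0, None)] * n_rows
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:order]


def reweighted_l1_lp(s, order, n_iter=3, eps=1e-3):
    """Assumed sketch: reweighted 1-norm linear prediction of the signal s."""
    s = np.asarray(s, dtype=float)
    # Row n of X holds s[n-1], ..., s[n-order]; only fully observed rows are used.
    X = np.column_stack([s[order - k: len(s) - k] for k in range(1, order + 1)])
    x = s[order:]
    w = np.ones(len(x))  # the first pass is plain 1-norm minimization
    for _ in range(n_iter):
        a = weighted_l1_predictor(x, X, w)
        r = x - X @ a
        # Small residuals get large weights, pushing them toward zero next pass.
        w = 1.0 / (np.abs(r) + eps)
    return a, r


def synth_sparse_ar(a_true, n=200, period=40):
    """Hypothetical test signal: AR filter driven by a sparse impulse train."""
    e = np.zeros(n)
    e[::period] = 1.0
    s = np.zeros(n)
    for i in range(n):
        past = sum(a_true[k] * s[i - k - 1]
                   for k in range(len(a_true)) if i - k - 1 >= 0)
        s[i] = e[i] + past
    return s, e


s, e = synth_sparse_ar([1.3, -0.8])
a_hat, r = reweighted_l1_lp(s, order=2)
print("estimated predictor coefficients:", a_hat)
print("residual samples above 1e-3:", int(np.sum(np.abs(r) > 1e-3)))
```

Each pass solves a weighted 1-norm problem exactly as a linear program, which keeps the sketch short but scales poorly with frame length; a dedicated convex solver would be the more usual choice for real speech frames.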
