A novel codebook search technique for estimating the open quotient

The open quotient (OQ), loosely defined as the proportion of time the glottis is open during phonation, is an important parameter in many source models. Accurate estimation of OQ from acoustic signals is a non-trivial process as it involves the separation of the source signal from the vocal-tract transfer function. Often this process is hampered by the lack of direct physiological data with which to calibrate algorithms. In this paper, an analysis-by-synthesis method using a codebook of harmonically-based Liljencrants-Fant (LF) source models in conjunction with a constrained optimizer was used to obtain estimates of OQ from four subjects. The estimates were compared with physiological measurements from high-speed imaging. Results showed relatively high correlations between the estimated and measured values for only two of the speakers, suggesting that existing source models may be unable to accurately represent some source signals.
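The codebook idea can be illustrated with a minimal sketch, under strong simplifying assumptions. A Rosenberg-style pulse stands in for the LF model, and the vocal tract is assumed to have been removed already. The constrained-optimizer stage is omitted. Every name and parameter value here (`glottal_pulse`, `harmonic_profile`, `build_codebook`, `estimate_oq`, the 0.67 speed ratio, the 0.30 to 0.90 OQ grid) is hypothetical and not taken from the paper.

```python
import math

def glottal_pulse(oq, n=400, speed=0.67):
    """One period of a Rosenberg-style flow pulse (a stand-in, NOT the LF model).

    oq    : open quotient, fraction of the period during which flow is nonzero
    speed : assumed fraction of the open phase spent rising
    """
    n_open = oq * n
    n_rise = speed * n_open
    samples = []
    for i in range(n):
        if i < n_rise:        # opening phase: raised-cosine rise from 0 to 1
            samples.append(0.5 * (1.0 - math.cos(math.pi * i / n_rise)))
        elif i < n_open:      # closing phase: quarter-cosine fall from 1 to 0
            samples.append(math.cos(0.5 * math.pi * (i - n_rise) / (n_open - n_rise)))
        else:                 # closed phase: no flow
            samples.append(0.0)
    return samples

def harmonic_profile(period, n_harm=10):
    """Magnitudes of harmonics 1..n_harm, normalized by the first harmonic."""
    n = len(period)
    mags = []
    for k in range(1, n_harm + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(period))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(period))
        mags.append(math.hypot(re, im))
    ref = max(mags[0], 1e-12)  # guard against division by zero
    return [m / ref for m in mags]

def build_codebook(lo=30, hi=90):
    """Codebook of (OQ, harmonic profile) pairs on a 0.01 grid (assumed range)."""
    return [(j / 100, harmonic_profile(glottal_pulse(j / 100))) for j in range(lo, hi + 1)]

def estimate_oq(target_period, codebook):
    """Return the OQ of the codebook entry whose harmonics best match the target."""
    target = harmonic_profile(target_period)
    def sq_error(entry):
        return sum((a - b) ** 2 for a, b in zip(entry[1], target))
    return min(codebook, key=sq_error)[0]
```

For example, `estimate_oq(glottal_pulse(0.62), build_codebook())` searches the 61-entry grid for the pulse whose harmonic magnitudes are closest to those of the target period. In practice the target would be a source signal estimated from recorded speech, which is where the difficulty described above arises.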
