Automatic pre-segmentation of running speech improves the robustness of several acoustic voice measures

To study vocal loading, we developed a speech analysis environment for continuous speech. The objective was to build a robust system capable of handling large amounts of data while minimizing the amount of user intervention required. The current version of the system can analyze recordings of up to five minutes of speech at a time. Through a semiautomatic process, it classifies a speech signal into segments of silence, voiced speech, and unvoiced speech. Parameters extracted from the input signal include fundamental frequency, sound pressure level, alpha-ratio, and speech segment information such as the ratio of speech to silence. This paper presents results from a performance evaluation of the system, which show that the analysis environment performs robust and consistent measurements of continuous speech.
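
The abstract does not say how the system separates silence, voiced speech, and unvoiced speech, so the following is only a minimal sketch of one classic approach: thresholding short-time energy and zero-crossing rate per frame. It is not the authors' method. The function names, the 20 ms frame length, and both thresholds (`silence_db`, `zcr_unvoiced`) are assumptions chosen for illustration, and the synthetic signal stands in for a real recording.

```python
import math
import random


def frame_features(signal, frame_len):
    """Return (rms_db, zero_crossing_rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Level in dB relative to a full-scale amplitude of 1.0 (floored to avoid log(0)).
        rms_db = 20.0 * math.log10(max(rms, 1e-10))
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((rms_db, crossings / (frame_len - 1)))
    return feats


def classify_frames(signal, frame_len, silence_db=-40.0, zcr_unvoiced=0.25):
    """Label each frame 'silence', 'voiced', or 'unvoiced' (assumed thresholds)."""
    labels = []
    for rms_db, zcr in frame_features(signal, frame_len):
        if rms_db < silence_db:
            labels.append("silence")    # too quiet to be speech
        elif zcr > zcr_unvoiced:
            labels.append("unvoiced")   # noisy, many sign changes
        else:
            labels.append("voiced")     # periodic, few sign changes
    return labels


def speech_to_silence_ratio(labels):
    """Ratio of speech frames (voiced + unvoiced) to silent frames."""
    silence = labels.count("silence")
    speech = len(labels) - silence
    return speech / silence if silence else float("inf")


def make_test_signal(fs=16000, seconds_each=0.5, seed=0):
    """Synthetic stand-in: silence, then a 120 Hz tone, then uniform noise."""
    n = int(fs * seconds_each)
    rng = random.Random(seed)
    silence = [0.0] * n
    voiced = [0.5 * math.sin(2.0 * math.pi * 120.0 * i / fs) for i in range(n)]
    unvoiced = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    return silence + voiced + unvoiced


fs = 16000
frame_len = fs // 50  # 20 ms frames (assumed)
labels = classify_frames(make_test_signal(fs), frame_len)
print({name: labels.count(name) for name in ("silence", "voiced", "unvoiced")})
print("speech-to-silence ratio:", speech_to_silence_ratio(labels))
```

Fixed thresholds like these are fragile on real recordings, where background noise and level differences between speakers shift both features. That fragility is one reason a semiautomatic process with some user intervention, as described above, can be more robust than a fully automatic one.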
