Pitch and noise estimation in hoarse voices

In pathologic voices, both slow and fast pitch variations within an utterance are indicative of the patient's condition. Moreover, the spectrogram of such voices usually shows strong noise components, closely related to the degree of perceived hoarseness. In the present paper, both pitch and noise variations are tracked during an utterance. This is accomplished by means of a two-step procedure for estimating f0, based on robust estimation approaches, which allows a varying, optimal time window to be selected for the analysis. The Normalised Noise Energy method [1] is revisited, and an adaptive version is applied to the optimised signal windows. Empty "dip" regions are avoided, and the method is applicable both to sustained vowels and to words. Simulations show the good performance of the proposed approach. Its application to real data allows the physician to track important voice parameters objectively.
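
As an illustration only, the idea of a two-step f0 search with a pitch-dependent analysis window can be sketched as follows. The abstract does not specify the robust estimators actually used, so a plain autocorrelation peak search stands in for them here. The function names (autocorr_f0, two_step_f0), the 40 ms coarse window, the three-period adaptive window and the search bounds are all assumptions made for this sketch, not values taken from the paper.

```python
import numpy as np

def autocorr_f0(frame, fs, fmin=60.0, fmax=500.0):
    """Estimate f0 (Hz) of one frame from the peak of its autocorrelation.

    Stand-in estimator: the paper's robust estimation approach is not
    described in the abstract, so this is an assumption.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # One-sided autocorrelation (lags 0 .. len(frame)-1)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = max(1, int(fs / fmax))
    lag_max = min(int(fs / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag

def two_step_f0(signal, fs, start, coarse_len=0.04, n_periods=3):
    """Two-step f0 estimate starting at sample index `start`.

    Step 1: coarse f0 on a fixed-length window (coarse_len seconds).
    Step 2: refined f0 on a window spanning n_periods pitch periods,
            searched only in a band around the coarse value.
    Returns (refined f0 in Hz, adaptive window length in samples).
    """
    n_coarse = int(coarse_len * fs)
    f0_coarse = autocorr_f0(signal[start:start + n_coarse], fs)

    # Window length now follows the local pitch period (hypothetical rule)
    n_adapt = int(round(n_periods * fs / f0_coarse))
    f0_fine = autocorr_f0(
        signal[start:start + n_adapt], fs,
        fmin=0.7 * f0_coarse, fmax=1.4 * f0_coarse,
    )
    return f0_fine, n_adapt
```

Calling two_step_f0 at successive start positions yields an f0 track together with the window length used at each position. A noise measure such as the Normalised Noise Energy could then be computed on each adaptive window rather than on a fixed one.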