Dysphonia detected by pattern recognition of spectral composition.

The vowel [a:] in a test word, judged normal or dysphonic, was examined with the Self-Organizing Map, the artificial neural network algorithm of Kohonen. The algorithm produces two-dimensional representations (maps) of speech. Input to the acoustic maps consisted of 15-component spectral vectors calculated at 9.83-ms intervals from short-time power spectra. The male and female maps were first calculated from the speech of healthy subjects, and the [a:] samples (15 successive spectral vectors) were then examined on the maps. The dysphonic voices deviated from the norm in two ways: in the composition of the short-time power spectra (seen as a dislocation of the trajectory pattern on the map) and in the stability of the spectrum during the performance (seen in the shape of the trajectory pattern on the map). Rough voices were distinguished from breathy ones by their patterns on the map. Because the speech material was limited, an index for the degree of pathology could not be determined. A self-organized acoustic map provides an on-line visual representation of voice and speech in an easily understandable form. The method is thus suitable not only for diagnostic but also for educational and therapeutic purposes.
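The mapping procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 6 x 6 map size, the learning-rate and neighbourhood schedules, the function names, and the random stand-in data are all assumptions made for illustration. Only the 15-component input vectors and the 15-frame vowel sample come from the abstract. The sketch trains a small self-organizing map on "healthy" vectors and then traces a sample as a trajectory of best-matching nodes.

```python
import math
import random


def best_matching_unit(weights, vec):
    """Return (row, col) of the map node whose weight vector is closest to vec."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, vec))
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def train_som(data, rows=6, cols=6, epochs=5, lr0=0.5, seed=0):
    """Train a rows x cols map; map size and schedules are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for vec in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            br, bc = best_matching_unit(weights, vec)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighbourhood around the best-matching node
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (vec[i] - w[i])
            step += 1
    return weights


def trajectory(weights, frames):
    """Map successive spectral vectors to the sequence of their best-matching nodes."""
    return [best_matching_unit(weights, f) for f in frames]


# Synthetic stand-in data (assumption): random 15-component "spectral" vectors.
data_rng = random.Random(1)
healthy = [[data_rng.random() for _ in range(15)] for _ in range(150)]
weights = train_som(healthy)

# One vowel sample: 15 successive 15-component vectors, as in the abstract.
sample = [[data_rng.random() for _ in range(15)] for _ in range(15)]
path = trajectory(weights, sample)
print(path)
```

With real spectra, where the trajectory lies on the map and how widely it wanders would correspond to the two kinds of deviation the abstract describes; the random data used here carries no such meaning.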
