It is well known that most laryngeal diseases and vocal fold pathologies cause significant changes in speech. Several clinical procedures for laryngeal examination exist, all of them invasive. In the evaluation of speech quality, acoustic analysis of normal and pathological voices has become increasingly interesting to researchers in laryngology and speech pathology because it is nonintrusive and can provide quantitative data within a reasonable analysis time. This article describes the implementation of a system for the automatic detection of laryngeal pathologies using acoustic analysis of speech in the frequency domain. Several speech signal processing techniques are applied: cepstrum, mel-cepstrum, delta cepstrum, delta mel-cepstrum, and FFT. The resulting features are fed to neural networks, which classify the voice patterns. Two types of neural network were examined: a system trained to distinguish between normal and pathological voices (regardless of the pathology), and a more complex system trained to classify normal, bicyclic, and rough voices. High recognition rates are obtained, with cepstral analysis being the processing technique that achieves the best performance. This indicates that cepstral analysis characterizes the pathological voice in a direct and noninvasive way. The results make this approach promising as a support tool for the diagnosis of pathologies of the vocal system.
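As a rough illustration of the cepstrum and delta cepstrum features mentioned above, the following minimal sketch computes them with NumPy. It is not the authors' implementation: the function names (`frame_signal`, `real_cepstrum`, `delta`), the frame length, hop size, number of coefficients, Hamming window, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def real_cepstrum(frames, n_coeffs):
    """Real cepstrum per frame: inverse FFT of the log magnitude spectrum."""
    frame_len = frames.shape[1]
    windowed = frames * np.hamming(frame_len)
    magnitude = np.abs(np.fft.rfft(windowed, axis=1))
    log_magnitude = np.log(magnitude + 1e-10)  # small offset avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude, n=frame_len, axis=1)
    return cepstrum[:, :n_coeffs]  # keep only the low-quefrency coefficients

def delta(features):
    """Delta features: frame-to-frame rate of change of each coefficient."""
    return np.gradient(features, axis=0)

# Hypothetical input: one second of a synthetic voiced-like signal at 16 kHz
# (a 120 Hz fundamental plus two harmonics), standing in for recorded speech.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in (1, 2, 3))

frames = frame_signal(signal, frame_len=512, hop=256)
cepstra = real_cepstrum(frames, n_coeffs=13)
features = np.hstack([cepstra, delta(cepstra)])  # one feature vector per frame
print(features.shape)
```

Each row of `features` is the kind of per-frame vector that could be passed to a classifier such as a neural network; the mel-cepstrum variants would additionally apply a mel-scaled filter bank before the logarithm, which this sketch omits.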