Classification of functional voice disorders based on phonovibrograms

OBJECTIVE This work presents a computer-aided method for automatically and objectively distinguishing healthy from dysfunctional vocal fold vibration patterns, as depicted in clinical high-speed (HS) videos of the larynx.

METHODS Using phonovibrography, a specialized technique for image segmentation and visualization of vocal fold movement, a novel set of numerical features is derived from laryngeal HS videos. These features capture the dynamic behavior and the symmetry of the oscillating vocal folds. To assess their discriminatory power, a support vector machine is applied to the preprocessed data for clinically relevant diagnostic tasks. The classification performance of the learned nonlinear models is then evaluated in order to draw conclusions about the suitability of the features and of the data obtained from different examination paradigms. As a reference, a second feature set is determined that corresponds to more traditional voice analysis approaches.

RESULTS For the first time, an automatic classification of healthy and pathological voices was obtained by analyzing the vibratory patterns of the vocal folds using phonovibrograms (PVGs). An average classification accuracy of approximately 81% was achieved for 2-class discrimination with PVG features, which exceeds the results obtained with traditional voice analysis features. Furthermore, the clinical HS data substantiated a relevant influence of phonation frequency on classification accuracy.

CONCLUSION The PVG feature extraction and classification approach is promising for the diagnosis of functional voice disorders. The results indicate that an objective analysis of dysfunctional vocal fold vibration can be achieved with high accuracy. Moreover, the PVG classification method has potential for the clinical assessment of voice pathologies in general, since it can provide diagnostic support to the voice clinician in a timely and reliable manner. Because of the observed interdependency between phonation frequency and classification accuracy, future comparative studies of HS recordings of oscillating vocal folds should take care to use homogeneous phonation frequencies during examination.
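The evaluation idea described under METHODS (train a support vector machine on numerical features, then measure 2-class accuracy on held-out recordings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state the validation scheme, so 5-fold stratified cross-validation is assumed; a linear SVM trained by sub-gradient descent on the hinge loss stands in for the nonlinear models used in the study; the data are synthetic 2-D points, not PVG features; and all function names are hypothetical.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch sub-gradient descent on the L2-regularised hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    This is a linear stand-in, not the nonlinear SVM of the study.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:  # sample lies inside the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds with equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k stratified folds."""
    accuracies = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = train_linear_svm([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic, well-separated 2-D points; NOT real PVG features."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = cross_validated_accuracy(X, y)
print(f"mean 5-fold accuracy: {accuracy:.2f}")
```

Stratified folds keep the share of each class the same in every split, so the per-fold accuracies remain comparable. The toy classes are deliberately easy to separate, so the printed value says nothing about the accuracy reported for real HS recordings.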
