Automatic Evaluation of Hypernasality Based on a Cleft Palate Speech Database

Hypernasality is one of the most typical characteristics of cleft palate (CP) speech, and the outcome of hypernasality grading determines whether follow-up surgery is necessary. Currently, CP speech is evaluated by experienced speech therapists, so the result depends strongly on their clinical experience and subjective judgment. This work proposes an automatic evaluation system for hypernasality grading in CP speech. The database used in this work was collected by the Hospital of Stomatology, Sichuan University, which has the largest number of CP patients in China. Based on the production process of hypernasality, features of the source sound pulse and the vocal tract filter are presented. These features include pitch, the first and second energy-amplified frequency bands, cepstrum-based features, MFCC, and short-time energy in sub-bands. Combined with a KNN classifier, these features are used to automatically classify four grades of hypernasality: normal, mild, moderate, and severe. The experimental results show that the classification rate for the four hypernasality grades reaches up to 80.4%. The sensitivity of the proposed features to gender is also discussed.
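The classification step can be illustrated with a minimal k-nearest-neighbour sketch. The abstract does not state the value of k, the distance metric, or any feature normalization, so the choices below (k = 3, Euclidean distance, majority vote) are assumptions, and the two-dimensional feature vectors are invented toy values rather than data from the CP speech database. The names `knn_predict` and `TOY_TRAIN` are hypothetical.

```python
import math
from collections import Counter

# The four hypernasality grades named in the abstract.
GRADES = ("normal", "mild", "moderate", "severe")

# Invented 2-D feature vectors for illustration only. In the described
# system each vector would hold acoustic features such as pitch, MFCC,
# or sub-band short-time energy.
TOY_TRAIN = [
    ((0.10, 0.90), "normal"), ((0.15, 0.85), "normal"), ((0.12, 0.95), "normal"),
    ((0.35, 0.65), "mild"), ((0.40, 0.60), "mild"), ((0.33, 0.70), "mild"),
    ((0.60, 0.40), "moderate"), ((0.65, 0.35), "moderate"), ((0.58, 0.45), "moderate"),
    ((0.90, 0.10), "severe"), ((0.85, 0.15), "severe"), ((0.92, 0.08), "severe"),
]


def knn_predict(train, query, k=3):
    """Return the majority grade among the k training vectors nearest to query.

    Assumes Euclidean distance. On a tied vote, the grade of the nearest
    tied neighbour wins, because Counter keeps first-seen order.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(grade for _, grade in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for query in [(0.11, 0.90), (0.62, 0.38), (0.88, 0.12)]:
        print(query, "->", knn_predict(TOY_TRAIN, query))
```

In practice the feature dimensions would differ widely in scale (pitch in Hz versus cepstral coefficients), so some normalization before computing distances would likely be needed. That step is omitted here because the abstract does not describe it.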
