Sinusoidal model-based hypernasality detection in cleft palate speech using CVCV sequence

Abstract: Hypernasality in the speech of children with cleft palate is a consequence of velopharyngeal insufficiency. Spectral analysis of hypernasal speech shows nasal formants and anti-formants in the spectrum, which alter the harmonic intensities. Nasal formants increase, and anti-formants decrease, the magnitude of the harmonics near the frequencies at which they are introduced. Hence, the spectra of hypernasal and normal speech differ. To capture this spectral difference, three features are proposed in this work: normalized harmonic amplitude (NHA), harmonic amplitude ratio (HAR), and prominent harmonics frequency (PHF). The NHA feature is the magnitude of each harmonic normalized by the maximum harmonic magnitude, the HAR feature is the magnitude of each harmonic relative to that of the preceding harmonic, and the PHF feature is the set of frequencies of the prominent harmonics in the spectrum. Using a support vector machine classifier, the combination of the three features detects hypernasality with accuracies of 82.46%, 87.89%, and 84.25% for the vowels /a/, /i/, and /u/, respectively.
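The three feature definitions above can be illustrated with a minimal sketch. It assumes the harmonic frequencies and magnitudes of one voiced frame have already been estimated (for example, by a sinusoidal analysis) and are on a linear scale. The function name `harmonic_features`, the parameter `n_prominent`, and the reading of "prominent" as "largest magnitude" are assumptions made for illustration; the abstract does not specify them, so this is not necessarily the authors' procedure.

```python
import numpy as np


def harmonic_features(freqs_hz, amps, n_prominent=3):
    """Compute NHA, HAR and PHF for one frame (illustrative sketch only).

    freqs_hz    : harmonic frequencies in Hz, ordered from lowest to highest
    amps        : linear magnitudes of those harmonics (all positive)
    n_prominent : hypothetical parameter, how many harmonics count as prominent
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    amps = np.asarray(amps, dtype=float)
    if amps.ndim != 1 or freqs_hz.shape != amps.shape or amps.size < 2:
        raise ValueError("need two 1-D arrays of equal length (at least 2)")
    if np.any(amps <= 0):
        raise ValueError("harmonic magnitudes must be positive")

    # NHA: each magnitude divided by the largest magnitude in the frame.
    nha = amps / amps.max()

    # HAR: each magnitude divided by the magnitude of the preceding harmonic.
    har = amps[1:] / amps[:-1]

    # PHF: frequencies of the n_prominent largest harmonics (assumed meaning
    # of "prominent"), returned in ascending frequency order.
    top = np.argsort(amps)[::-1][:n_prominent]
    phf = np.sort(freqs_hz[top])

    return nha, har, phf


# Made-up harmonic data for a frame with a 200 Hz fundamental.
freqs = [200.0, 400.0, 600.0, 800.0, 1000.0]
amps = [1.0, 0.5, 2.0, 0.25, 0.8]
nha, har, phf = harmonic_features(freqs, amps, n_prominent=2)
```

In a full pipeline, the three vectors would typically be concatenated per frame and passed to a classifier such as a support vector machine; that stage is omitted here because the abstract gives no details of the training setup.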
