Detection of Nasalized Voiced Stops in Cleft Palate Speech Using Epoch-Synchronous Features

Velopharyngeal dysfunction in individuals with cleft palate (CP) nasalizes the voiced stops. As a result, voiced stops (/b/, /d/, /g/) tend to be perceived as nasal consonants (/m/, /n/, /ng/). In this work, a novel algorithm is proposed for the detection of nasalized voiced stops in CP speech using epoch-synchronous features. Speech regions corresponding to consonants and consonant-vowel transitions are segmented using knowledge of glottal activity, syllable nucleus, low-frequency spectral dominance, and vowel onset point. The segmented regions are processed epoch-synchronously to analyze the spectral, spectro-temporal, excitation source, and periodicity characteristics of normal and nasalized voiced stops. Spectral and spectro-temporal features are computed using a single-pole-filter-based time-frequency representation. The amplitude of the Hilbert envelope of the linear prediction residual, measured around the epoch, is used to analyze the effect of nasalization on the excitation source. Speech frames of successive inter-epoch intervals are compared to analyze the periodicity characteristics. The proposed features are used to develop a support vector machine classifier for the classification of normal and nasalized voiced stops. The segmentation accuracy of the proposed knowledge-based method is found to be better than that of the hidden Markov model-based forced-alignment approach. The detection rate of nasalized voiced stops is found to be higher for the proposed epoch-synchronous features than for the conventional Mel-frequency cepstral coefficients.
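To make the periodicity analysis concrete, the following is a minimal sketch of one way to compare speech frames of successive inter-epoch intervals. The abstract does not state which similarity measure is used, so the normalized correlation here is an assumption, as are the function name `inter_epoch_similarity`, the truncation of each frame pair to the shorter length, and the synthetic test signal with known epoch locations. Epoch extraction itself is not shown.

```python
import math


def inter_epoch_similarity(signal, epochs):
    """Normalized correlation between each pair of successive inter-epoch frames.

    signal: list of samples.
    epochs: increasing sample indices of the epochs (assumed already extracted).
    Returns one score in [-1, 1] per pair of adjacent inter-epoch intervals.
    """
    scores = []
    for i in range(len(epochs) - 2):
        # Frame between epoch i and i+1, and the frame that follows it.
        a = signal[epochs[i]:epochs[i + 1]]
        b = signal[epochs[i + 1]:epochs[i + 2]]
        # Assumption: truncate both frames to the shorter length before comparing.
        n = min(len(a), len(b))
        a, b = a[:n], b[:n]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        scores.append(num / den if den > 0 else 0.0)
    return scores


# Synthetic, perfectly periodic signal with hypothetical epochs every 80 samples.
period = 80
signal = [math.sin(2 * math.pi * n / period) for n in range(400)]
epochs = list(range(0, 401, period))
scores = inter_epoch_similarity(signal, epochs)
```

For this synthetic signal every score is close to 1, since each inter-epoch frame repeats the previous one. Under the assumption above, lower scores would indicate less regular cycle-to-cycle behavior within the segmented region.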
