Epoch Extraction from Pathological Children Speech Using Single Pole Filtering Approach

The instant of significant excitation of the vocal tract system is referred to as the epoch of the speech signal. High pitch and aperiodicity are the major challenges for epoch extraction from the speech of pathological children. In this work, the impulse-like characteristics of epochs, derived from a single pole filter based time-frequency representation, are exploited to propose an epoch extraction algorithm for pathological children's speech. The sharp transitions present in the single pole filtered envelope at the epochs are enhanced using multi-scale product computation. The combined evidence derived from the multi-scale products of the filtered envelopes at different frequencies is then used to locate the epochs. The proposed algorithm is evaluated on the Saarbruecken Voice Database, which contains pathological children's speech and simultaneously recorded electroglottographic signals. The proposed method shows better identification accuracy for pathological children's speech than state-of-the-art techniques.
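
The pipeline described above (single pole filtered envelopes at several frequencies, multi-scale product, combined evidence) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the pole radius `r = 0.99`, the frequency grid, the dyadic scales `(1, 2, 4)`, the Haar-like difference used at each scale, and the rectify-and-sum combination are all assumptions chosen for illustration. The input is a synthetic impulse train rather than speech, and any pre-processing of the speech signal is omitted.

```python
import numpy as np

def single_pole_envelope(x, fs, freq_hz, r=0.99):
    """Envelope of x at freq_hz from a single pole filter with its pole at z = -r.

    The signal is frequency-shifted so that freq_hz lands at fs/2, then passed
    through y[n] = -r * y[n-1] + x_shifted[n]. The value of r is an assumption.
    """
    n = np.arange(len(x))
    shift = np.pi - 2.0 * np.pi * freq_hz / fs  # moves freq_hz to fs/2
    shifted = x * np.exp(1j * shift * n)
    y = np.zeros(len(x), dtype=complex)
    prev = 0.0 + 0.0j
    for i, s in enumerate(shifted):
        prev = -r * prev + s
        y[i] = prev
    return np.abs(y)

def multiscale_product(env, scales=(1, 2, 4)):
    """Product over scales of a Haar-like difference of the envelope.

    At scale s the difference is mean(env[i:i+s]) - mean(env[i-s:i]), so a
    sharp rise at sample i gives a large positive value at every scale.
    The scales and the difference operator are illustrative assumptions.
    """
    prod = np.ones(len(env))
    for s in scales:
        kernel = np.concatenate([np.ones(s), -np.ones(s)]) / s
        prod *= np.convolve(env, kernel, mode="same")
    return prod

def epoch_evidence(x, fs, freqs_hz, r=0.99):
    """Sum the positive part of the multi-scale product over all frequencies."""
    evidence = np.zeros(len(x))
    for f in freqs_hz:
        env = single_pole_envelope(x, fs, f, r)
        evidence += np.maximum(multiscale_product(env), 0.0)
    return evidence

# Synthetic check: an impulse train stands in for the excitation.
fs = 8000
x = np.zeros(1000)
true_epochs = np.arange(100, 1000, 100)  # hypothetical epoch locations
x[true_epochs] = 1.0
evidence = epoch_evidence(x, fs, np.arange(500, 4000, 500))
```

On this synthetic input, the largest evidence value within each pitch period falls on the impulse location. Real pathological speech is aperiodic and noisy, so a practical detector would also need peak picking and voicing decisions, which this sketch leaves out.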
