Phoneme Based Respiratory Analysis of Read Speech

Recent work shows that deep learning techniques can sense a speaker's respiratory parameters directly from the speech signal, which could benefit future telehealth services. In this paper, we study how respiratory effort depends on the linguistic content of the spoken utterance by analyzing respiratory belt sensor data together with phoneme-aligned speech data. The results show, for example, that among the broad phonetic classes respiratory effort was highest for fricatives, and especially high for the glottal consonants. These insights may help to develop more efficient protocols for respiratory health monitoring in telehealth applications.
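
To make the kind of analysis concrete, the following is a minimal sketch of how a per-class effort statistic could be computed from a belt signal and a phoneme alignment. It is not the procedure used in the paper: the effort measure (rate of decline of the belt signal over a phoneme interval), the function name `effort_per_class`, the `BROAD_CLASS` mapping, and the input layout are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from phoneme labels to broad phonetic classes.
# A real study would use the full phoneme inventory of its aligner.
BROAD_CLASS = {
    "s": "fricative", "f": "fricative", "h": "fricative",
    "a": "vowel", "i": "vowel",
    "p": "stop", "t": "stop",
    "m": "nasal", "n": "nasal",
}


def effort_per_class(belt, sample_rate, alignment):
    """Return the mean belt-signal decline rate per broad phonetic class.

    belt        -- list of belt sensor samples (larger = more expanded chest)
    sample_rate -- belt samples per second
    alignment   -- list of (phoneme, start_sec, end_sec) tuples

    Assumption: "effort" is approximated by how fast the belt signal
    falls during the phoneme, in belt units per second.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for phoneme, start, end in alignment:
        cls = BROAD_CLASS.get(phoneme)
        i0 = int(round(start * sample_rate))
        i1 = min(int(round(end * sample_rate)), len(belt) - 1)
        if cls is None or i1 <= i0:
            continue  # skip unknown phonemes and empty intervals
        duration = (i1 - i0) / sample_rate
        totals[cls] += (belt[i0] - belt[i1]) / duration
        counts[cls] += 1
    return {cls: totals[cls] / counts[cls] for cls in totals}


# Toy example: a 10 Hz belt signal and three aligned phonemes.
belt = [10.0, 9.0, 8.0, 7.5, 7.0, 6.0, 5.0]
alignment = [("s", 0.0, 0.2), ("a", 0.2, 0.4), ("h", 0.4, 0.6)]
result = effort_per_class(belt, 10, alignment)
```

In this toy input the belt signal falls faster during the two fricative intervals than during the vowel, so the fricative class receives the larger mean value. The numbers are invented and carry no empirical meaning.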
