Analyzing Breath Signals for the Interspeech 2020 ComParE Challenge

This paper presents our contribution to the INTERSPEECH 2020 Breathing Sub-challenge. The main goal of the challenge is to automatically predict, from conversational speech, the breath signals recorded with respiratory belts. Beyond addressing that goal, we analyse both the original and the predicted signals in an attempt to overcome the main pitfalls of the proposed systems. In particular, we identify the subsets of the most irregular belt signals, which yield the worst performance as measured by the Pearson correlation coefficient, and show how they affect the results obtained by both the baseline end-to-end system and variants such as a bidirectional LSTM. The performance of the bidirectional architecture indicates that future information is also relevant when predicting breathing patterns. We also study how much of the relevant information is retained in the AM-FM decomposition of the speech signal: the AM component significantly outperforms the FM component in all experiments, but fails to surpass the prediction results obtained using the original speech signal. Finally, we validate the system’s performance in videoconferencing conditions by using data augmentation, and we compare clinically relevant parameters, such as breathing rate, estimated from the original belt signals and from the signals predicted from the simulated videoconferencing speech.
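The two quantities named in the abstract, the Pearson correlation coefficient between a belt signal and its prediction and the breathing rate of each signal, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' procedure: the function names, the 25 Hz sample rate, the synthetic sinusoidal "belt" signal, and the choice to estimate breathing rate as the dominant spectral peak between 0.1 and 1.0 Hz.

```python
import numpy as np


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between two equal-length signals."""
    x = np.asarray(reference, dtype=float)
    y = np.asarray(predicted, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))


def breathing_rate_bpm(signal, sample_rate_hz, min_hz=0.1, max_hz=1.0):
    """Breaths per minute, taken as the strongest spectral peak in [min_hz, max_hz].

    The band limits are illustrative assumptions, not values from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset before the FFT
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate_hz)
    band = (freqs >= min_hz) & (freqs <= max_hz)
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return float(peak_hz * 60.0)


# Synthetic stand-ins for a belt signal and a noisy prediction of it.
sample_rate_hz = 25.0  # assumed sample rate, for illustration only
t = np.arange(0.0, 60.0, 1.0 / sample_rate_hz)
belt = np.sin(2.0 * np.pi * 0.25 * t)  # one breath every 4 seconds
rng = np.random.default_rng(0)
predicted = 0.8 * belt + 0.3 * rng.standard_normal(t.size)

r = pearson_r(belt, predicted)
rate_belt = breathing_rate_bpm(belt, sample_rate_hz)
rate_predicted = breathing_rate_bpm(predicted, sample_rate_hz)
print(f"Pearson r: {r:.3f}")
print(f"Breathing rate (belt): {rate_belt:.1f} breaths/min")
print(f"Breathing rate (predicted): {rate_predicted:.1f} breaths/min")
```

A spectral-peak estimate is only one option. It suits regular breathing, whereas the irregular belt signals discussed in the abstract may be better served by a peak-counting approach over short windows.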
