Sound Event Detection and Haptic Vibration Based Home Monitoring Assistant System for the Deaf and Hard-of-Hearing

Acoustic signals carry a significant amount of information about the sources that generate them. Unfortunately, deaf and hard-of-hearing people cannot access this information, so assistive technology is required to help people with hearing loss. In this paper, we present a home monitoring assistant system for the deaf and hard-of-hearing based on sound event detection and sound-to-haptic conversion. The system detects sounds generated in the home environment, converts each detected sound into text and haptic vibration, and delivers both to the user. The proposed approach comprises four modules: signal estimation, reliable sensor channel selection, sound event detection, and conversion of sound into haptic vibration. During signal estimation, lost packets are recovered to improve signal quality. Next, reliable channels are selected using a multi-channel cross-correlation coefficient to improve the computational efficiency of distant sound event detection. Finally, the signals from the two selected channels are used for environmental sound event detection based on bidirectional gated recurrent neural networks and for sound-to-haptic effect conversion using kernel-based source separation. Experiments show that the proposed approach achieves superior performance compared to the baseline.
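As an illustration of the channel selection step, the following is a minimal sketch rather than the implementation used in this work. It assumes the common definition of the squared multi-channel cross-correlation coefficient as one minus the determinant of the normalized correlation matrix, assumes the channels are already time-aligned, and simply picks the pair of channels with the highest value. The function names `mccc_squared` and `select_channels` are hypothetical.

```python
import itertools

import numpy as np


def mccc_squared(signals):
    """Squared MCCC of a set of channels (assumed definition).

    signals: array of shape (channels, samples), assumed time-aligned.
    Returns 1 - det(R), where R is the normalized correlation matrix.
    """
    r = np.corrcoef(signals)  # rows are treated as channels
    return 1.0 - np.linalg.det(r)


def select_channels(signals, k=2):
    """Return the indices of the k channels with the largest squared MCCC.

    Exhaustive search over all k-subsets; fine for a handful of sensors.
    """
    n_channels = signals.shape[0]
    subsets = itertools.combinations(range(n_channels), k)
    return max(subsets, key=lambda idx: mccc_squared(signals[list(idx)]))
```

For two channels this quantity reduces to the squared Pearson correlation between them, so the selected pair is the one whose recordings agree most closely. A real deployment would also need to compensate for inter-sensor delay before computing the correlation, which this sketch leaves out.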
