Paper Information - Speech Emotion Detection using IoT based Deep Learning for Health Care

Human emotions are essential for recognizing a person's behavior and state of mind. Emotion detection from speech signals has recently received growing attention. This paper proposes a method for detecting human emotions from speech signals and implements it in real time using Internet of Things (IoT) based deep learning for the care of older adults in nursing homes. The research has two main contributions. First, we have implemented a real-time system based on audio IoT, in which we record the human voice and predict emotions via deep learning. Second, for advanced classification, we have designed a model using data normalization and data augmentation techniques. Finally, we have created an integrated deep learning model, called Speech Emotion Detection (SED), using a 2D convolutional neural network (CNN). The best accuracy achieved by our method was approximately 95%, which outperformed all state-of-the-art approaches. We have further applied the SED model to a live audio sentiment analysis system with IoT technologies for the care of older adults in nursing homes.
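The abstract names data normalization and data augmentation but does not say which variants were used. As a rough illustration only, the sketch below assumes peak normalization, additive white noise, and a circular time shift, which are common choices for audio; the function names (`peak_normalize`, `add_noise`, `time_shift`, `augment`) and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def peak_normalize(signal, eps=1e-8):
    """Scale a waveform so its largest absolute sample is about 1.0."""
    signal = np.asarray(signal, dtype=np.float64)
    return signal / (np.max(np.abs(signal)) + eps)


def add_noise(signal, noise_level=0.005, rng=None):
    """Add Gaussian white noise (assumed augmentation, not from the paper)."""
    rng = np.random.default_rng() if rng is None else rng
    return signal + noise_level * rng.standard_normal(signal.shape)


def time_shift(signal, shift):
    """Circularly shift the waveform by `shift` samples."""
    return np.roll(signal, shift)


def augment(signal, rng=None):
    """Return the normalized waveform plus two augmented copies."""
    rng = np.random.default_rng() if rng is None else rng
    base = peak_normalize(signal)
    # Shift by at most 10% of the clip length in either direction.
    max_shift = len(base) // 10
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return [base, add_noise(base, rng=rng), time_shift(base, shift)]
```

Each returned waveform has the same length as the input, so the copies could be passed through the same feature extraction step before reaching a 2D CNN; that downstream pipeline is not described in the abstract and is left out here.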

Yugyung Lee | Zeenat Tariq | Sayed Khushal Shah
