Analysis of Affective Computing for Marathi Corpus using Deep Learning

Speech Emotion Recognition (SER) offers a wide range of potential uses, including strengthening human-computer interaction in virtual reality and gaming settings, supporting the detection and tracking of mental health disorders, and improving the accuracy of speech-based assistants and chatbots. It faces challenges such as cross-corpus SER and variations in intonation, dialect, and prosody arising from differences in age, gender, region, and religion. This paper presents a deep Convolutional Neural Network (DCNN) based SER system for the Marathi language. Our novel Marathi dataset consists of 300 recordings from 15 speakers covering the emotions Anger, Happy, Sad, and Neutral. The performance of the proposed DCNN is evaluated on this dataset using accuracy, precision, recall, and F1-score. On raw data, the proposed scheme achieves overall accuracies of 0.4750, 0.4076, and 0.3927 for 5, 10, and 15 speakers, respectively; after feature extraction, the overall accuracies improve to 0.6652, 0.6361, and 0.5800 for 5, 10, and 15 speakers, respectively, showing an improvement over existing state-of-the-art approaches for SER on a Marathi corpus.
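To make the pipeline concrete, the following is a minimal sketch of a feature-extraction step and a small DCNN classifier for the four emotion classes (Anger, Happy, Sad, Neutral). The MFCC parameters, layer sizes, and training settings shown here are illustrative assumptions, not the exact configuration used in the paper.

```python
# Hypothetical sketch: MFCC feature extraction + a small 2D CNN for 4-class SER.
# Architecture and hyperparameters are assumptions for illustration only.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

EMOTIONS = ["anger", "happy", "sad", "neutral"]

def extract_mfcc(path, sr=16000, n_mfcc=40, max_frames=200):
    """Load one recording and return a fixed-size MFCC matrix (n_mfcc x max_frames)."""
    signal, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along the time axis so every sample has the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    return mfcc[:, :max_frames]

def build_dcnn(input_shape=(40, 200, 1), n_classes=len(EMOTIONS)):
    """Small 2D CNN over the MFCC 'image'; depth and filter counts are illustrative."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In such a setup, each recording would be converted to an MFCC matrix, stacked into a tensor with a trailing channel dimension, and split by speaker before training; accuracy, precision, recall, and F1-score can then be computed on the held-out speakers.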
