Automated Facial Expression and Speech Emotion Recognition App Development on Smart Phones using Cloud Computing

Because emotions are a significant and integral part of being human, understanding them and knowing how to react to the emotions of others is a fundamental requirement for successful social interaction. We recognize emotions primarily through speech and facial expression. The topic is gaining importance in academic research, driven by new techniques such as identifying emotions from speech context, which examines the relationship between emotions and the content of speech. This paper proposes a framework, consisting of mobile phone technology backed by cloud computing, for recognizing emotion in speech and facial expression in real time. This functionality was developed and built into a mobile phone application. Currently, the application works on any Android smartphone to detect and identify emotions in real time. The results are expressed as percentages across the possible emotions, such as sadness, happiness, fear, surprise and anger. The results of the experiment confirm that face and speech emotion recognition was conducted successfully using a smartphone. The application was correct in 97.26% of instances when evaluated on two standard corpora: (a) the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and (b) the Surrey Audio-Visual Expressed Emotion (SAVEE) database.
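
The abstract states that results are reported as a percentage over the possible emotions but does not say how those percentages are computed. As a minimal sketch, assuming the classifier returns one raw score per emotion and that a softmax normalization is used (both are assumptions, not details from the paper), the conversion could look like this. The function name `to_percentages` and the example scores are hypothetical.

```python
import math


def to_percentages(scores):
    """Turn raw per-emotion scores into percentages that sum to 100.

    Uses a softmax; this is an assumed normalization, not the paper's
    documented method.
    """
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(value - peak) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: 100.0 * e / total for label, e in exps.items()}


# Hypothetical raw scores for a single face or speech sample.
raw_scores = {"sad": 0.2, "happy": 2.5, "fear": -0.4, "surprise": 1.1, "anger": -1.0}
percentages = to_percentages(raw_scores)

# Print the emotions from most to least likely.
for label, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{label}: {pct:.1f}%")
```

Reporting the full distribution rather than a single label lets the application show mixed or uncertain predictions instead of forcing one answer.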
