Patient Facial Emotion Recognition and Sentiment Analysis Using Secure Cloud with Hardware Acceleration

Abstract This chapter conducts sentiment analysis of medical patients through facial emotion recognition. In this work, a patient's facial expression is dynamically captured from a video stream, recognized using machine learning-based image classifiers, and classified into one of seven emotion categories: angry, disgusted, fearful, happy, sad, surprised, and neutral. The system design includes three primary components: a video processor, a face detector, and an image classifier. The video processor captures a frame containing the patient's face from the video feed and converts it to a grayscale image, which is then passed to the face detector. The face detector locates the face and crops the image around it. The cropped face image is then resized and fed into the image classifier, which recognizes the facial emotion. The current state-of-the-art facial emotion recognition model makes use of a convolutional neural network (CNN) with three hidden layers and a linear support vector machine as the output layer. The model presented in this chapter extends the state-of-the-art model by using pre-processed video-stream frames as the input to the CNN: face detection, RGB-to-grayscale conversion, and Gaussian normalization are applied to each image before it is input to the network. Besides extending the architecture of the state-of-the-art model, this work adds an optimization step during training that detects saturation in the validation accuracy and decreases the learning rate to combat it. This added optimization allows the model to “narrow in” on the trainable-parameter values that minimize the error function. The implemented system shows inference accuracy comparable to state-of-the-art facial emotion classifiers. Furthermore, the proposed design is carried out on a remote, cloud-based GPU platform using programming techniques developed specifically for GPU devices (i.e., CUDA). As a result, model training is significantly accelerated compared to running on a traditional CPU, which also enables the system to run in real time. Finally, a case study has been conducted to automatically understand patients' pain intensity levels using the proposed facial emotion recognition system. This framework, named “DeepPain”, correlates pain intensity with the probabilities that the patient image falls into the different facial emotion categories. In particular, the probabilities of two emotions, “fearful” and “sad”, are determined to be closely related to the pain intensity level and are used as the input to the DeepPain framework. Detecting and understanding pain intensity at a fine-grained level, as performed by DeepPain, will play a crucial role in healthcare.
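
To make the pre-processing pipeline concrete, the following is a minimal Python sketch of the stage described above (frame capture, grayscale conversion, face detection and cropping, resizing, and Gaussian normalization). The use of OpenCV, its Haar-cascade frontal-face detector, and the 48 x 48 input size are illustrative assumptions; the chapter does not prescribe these specific tools or dimensions.

import cv2
import numpy as np

# Hypothetical detector choice; any frontal-face detector could be substituted.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess_frame(frame, size=(48, 48)):
    """Turn one BGR video frame into a normalized face image,
    or None if no face is found."""
    # Color-to-grayscale conversion.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Face detection: crop the image around the first detected face.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    face = gray[y:y + h, x:x + w]
    # Resize to the classifier's fixed input size.
    face = cv2.resize(face, size).astype(np.float32)
    # Gaussian normalization: zero mean, unit variance.
    return (face - face.mean()) / (face.std() + 1e-7)

# Usage: read one frame from the video feed and pre-process it for the CNN.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    face_image = preprocess_frame(frame)  # (48, 48) array, or None
cap.release()

The saturation-aware training step can likewise be sketched with a standard learning-rate scheduler. Keras exposes this pattern as the ReduceLROnPlateau callback; the factor and patience values below are assumptions, not the chapter's reported settings.

from tensorflow.keras.callbacks import ReduceLROnPlateau

# When validation accuracy stops improving (saturates), scale the learning
# rate down so the optimizer can narrow in on an error-function minimum.
plateau = ReduceLROnPlateau(
    monitor="val_accuracy",  # watch validation accuracy for saturation
    mode="max",              # accuracy should increase
    factor=0.5,              # halve the learning rate on a plateau
    patience=3,              # epochs without improvement before reducing
    min_lr=1e-6)             # lower bound on the learning rate

# model.fit(train_images, train_labels,
#           validation_data=(val_images, val_labels),
#           epochs=50, callbacks=[plateau])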