Training Deep Neural Networks with Different Datasets In-the-wild: The Emotion Recognition Paradigm

This paper presents a novel procedure for training a deep convolutional and recurrent neural network that takes into account both the available training dataset and information extracted from similar networks trained on other relevant datasets. This information is incorporated into an extended loss function used during training, so that the network improves its performance on those other datasets without forgetting the knowledge learned from the original one. Facial expression and emotion recognition in-the-wild serves as the test-bed application demonstrating the improved performance achieved with the proposed approach. Within this framework, we provide an experimental study on categorical emotion recognition using datasets from a recent related emotion recognition challenge.
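To make the idea of the "extended" loss concrete, below is a minimal sketch, not the authors' implementation: the standard classification loss on the primary dataset is augmented with terms that keep the network's predictions close to those of reference networks previously trained on other relevant datasets. The KL-divergence form of the extra terms and the per-network weights (`lambdas`) are assumptions made only for illustration.

```python
# Hedged sketch of an extended loss combining the primary-dataset objective
# with soft-label terms from networks trained on other datasets.
# The distillation-style KL term and the weighting scheme are assumptions.
import torch
import torch.nn.functional as F

def extended_loss(logits, targets, reference_logits_list, lambdas):
    """logits:  (batch, classes) outputs of the network being trained.
    targets:    (batch,) ground-truth labels from the primary dataset.
    reference_logits_list: list of (batch, classes) outputs produced by
        networks previously trained on other relevant datasets.
    lambdas:    list of weights, one per reference network (hypothetical)."""
    # Usual supervised loss on the original dataset.
    loss = F.cross_entropy(logits, targets)
    log_probs = F.log_softmax(logits, dim=1)
    for ref_logits, lam in zip(reference_logits_list, lambdas):
        # Treat the reference network's predictions as fixed soft targets,
        # discouraging the network from forgetting that knowledge.
        ref_probs = F.softmax(ref_logits.detach(), dim=1)
        loss = loss + lam * F.kl_div(log_probs, ref_probs, reduction="batchmean")
    return loss
```

In this reading, setting a `lambda` to zero recovers ordinary training on the original dataset, while larger values trade primary-task accuracy for agreement with the reference networks.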
