IExpressNet: Facial Expression Recognition with Incremental Classes

Existing methods on facial expression recognition (FER) are mainly trained in the setting when all expression classes are fixed in advance. However, in real applications, expression classes are becoming increasingly fine-grained and incremental. To deal with sequential expression classes, we can fine-tune or re-train these models, but this often results in poor performance or large computing resources consumption. To address these problems, we develop an Incremental Facial Expression Recognition Network (IExpressNet), which can learn a competitive multi-class classifier at any time with a lower requirement of computing resources. Specifically, IExpressNet consists of two novel components. First, we construct an exemplar set by dynamically selecting representative samples from old expression classes. Then, the exemplar set and new expression classes samples constitute the training set. Second, we design a novel center-expression-distilled loss. As for facial expression in the wild, center-expression-distilled loss enhances the discriminative power of the deeply learned features and prevents catastrophic forgetting. Extensive experiments are conducted on two large-scale FER datasets in the wild, RAF-DB and AffectNet. The results demonstrate the superiority of the proposed method as compared to state-of-the-art incremental learning approaches.

[1]  Shan Li,et al.  Deep Facial Expression Recognition: A Survey , 2018, IEEE Transactions on Affective Computing.

[2]  Sridha Sridharan,et al.  Using Synthetic Data to Improve Facial Expression Analysis with 3D Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[3]  Gert Cauwenberghs,et al.  Incremental and Decremental Support Vector Machine Learning , 2000, NIPS.

[4]  John K. Tsotsos,et al.  Incremental Learning Through Deep Adaptation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Yantao Yu,et al.  An Input-aware Factorization Machine for Sparse Prediction , 2019, IJCAI.

[6]  Andrea Cavallaro,et al.  Automatic Analysis of Facial Affect: A Survey of Registration, Representation, and Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Pascal Belin,et al.  Recognition and discrimination of prototypical dynamic expressions of pain and emotions , 2008, PAIN®.

[8]  Changsheng Xu,et al.  Joint Pose and Expression Modeling for Facial Expression Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9]  Xinlei Chen,et al.  NEIL: Extracting Visual Knowledge from Web Data , 2013, 2013 IEEE International Conference on Computer Vision.

[10]  Aleix M. Martínez,et al.  A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives , 2012, J. Mach. Learn. Res..

[11]  G. Cottrell,et al.  Categorical Perception in Facial Emotion Classification , 1996 .

[12]  Yong Du,et al.  Facial Expression Recognition Based on Deep Evolutional Spatial-Temporal Networks , 2017, IEEE Transactions on Image Processing.

[13]  Joonwhoan Lee,et al.  Fuzzy Similarity-Based Emotional Classification of Color Images , 2011, IEEE Transactions on Multimedia.

[14]  Lijun Yin,et al.  Facial Expression Recognition by De-expression Residue Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[16]  Xinlei Chen,et al.  Never-Ending Learning , 2012, ECAI.

[17]  Youbao Tang,et al.  Discrete Probability Distribution Prediction of Image Emotions with Shared Sparse Learning , 2020, IEEE Transactions on Affective Computing.

[18]  Sergio Escalera,et al.  Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Dian Tjondronegoro,et al.  Facial Expression Recognition Using Facial Movement Features , 2011, IEEE Transactions on Affective Computing.

[20]  Takeo Kanade,et al.  Automated facial expression recognition based on FACS action units , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[21]  P. Ekman,et al.  DIFFERENCES Universals and Cultural Differences in the Judgments of Facial Expressions of Emotion , 2004 .

[22]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Junping Du,et al.  Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Yuxin Peng,et al.  Error-Driven Incremental Learning in Deep Convolutional Neural Network for Large-Scale Image Classification , 2014, ACM Multimedia.

[25]  Kurt Keutzer,et al.  EmotionGAN: Unsupervised Domain Adaptation for Learning Discrete Probability Distributions of Image Emotions , 2018, ACM Multimedia.

[26]  Matthieu Guillaumin,et al.  Incremental Learning of NCM Forests for Large-Scale Image Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Tinne Tuytelaars,et al.  Expert Gate: Lifelong Learning with a Network of Experts , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Mohamed Daoudi,et al.  A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Shan Li,et al.  Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition , 2019, IEEE Transactions on Image Processing.

[30]  Adrian Popescu,et al.  DeeSIL: Deep-Shallow Incremental Learning , 2018, ECCV Workshops.

[31]  Steffen Lange,et al.  On the power of incremental learning , 2002, Theor. Comput. Sci..

[32]  Nicu Sebe,et al.  Expression Conditional Gan for Facial Expression-to-Expression Translation , 2019, 2019 IEEE International Conference on Image Processing (ICIP).

[33]  Max Welling,et al.  Herding dynamical weights to learn , 2009, ICML '09.

[34]  Andrea Vedaldi,et al.  Efficient Parametrization of Multi-domain Deep Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35]  Kristen A. Lindquist,et al.  The brain basis of emotion: A meta-analytic review , 2012, Behavioral and Brain Sciences.

[36]  Dacheng Tao,et al.  Deep Streaming Label Learning , 2020, ICML.

[37]  Shiguang Shan,et al.  Facial Expression Recognition with Inconsistently Annotated Datasets , 2018, ECCV.

[38]  Siyu Zhu,et al.  Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  P. Ekman,et al.  Constants across cultures in the face and emotion. , 1971, Journal of personality and social psychology.

[40]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Hatice Gunes,et al.  Continuous Prediction of Spontaneous Affect from Multiple Cues and Modalities in Valence-Arousal Space , 2011, IEEE Transactions on Affective Computing.

[42]  Bo Sun,et al.  Combining Sequential Geometry and Texture Features for Distinguishing Genuine and Deceptive Emotions , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[43]  Yue Gao,et al.  Predicting Personalized Emotion Perceptions of Social Images , 2016, ACM Multimedia.

[44]  Sebastian Thrun,et al.  Lifelong Learning Algorithms , 1998, Learning to Learn.

[45]  Bo Yuan,et al.  A Dual Input-aware Factorization Machine for CTR Prediction , 2020, IJCAI.

[46]  Yue Gao,et al.  Predicting Personalized Image Emotion Perceptions in Social Networks , 2018, IEEE Transactions on Affective Computing.

[47]  Daniel McDuff,et al.  Affect valence inference from facial action unit spectrograms , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[48]  Mohammad H. Mahoor,et al.  AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild , 2017, IEEE Transactions on Affective Computing.

[49]  Jun Ohya,et al.  Extracting Facial Motion Parameters by Tracking Feature Points , 1998, AMCP.

[50]  Adrian Popescu,et al.  IL2M: Class Incremental Learning With Dual Memory , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[51]  Marian Stewart Bartlett,et al.  Classifying Facial Actions , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[52]  Ronald Kemker,et al.  FearNet: Brain-Inspired Model for Incremental Learning , 2017, ICLR.

[53]  Huafeng Wu,et al.  Residual-Guide Network for Single Image Deraining , 2018, ACM Multimedia.

[54]  Matti Pietikäinen,et al.  Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Cordelia Schmid,et al.  End-to-End Incremental Learning , 2018, ECCV.

[56]  Rama Chellappa,et al.  Learning Without Memorizing , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  John W. Paisley,et al.  A Segmentation-aware Deep Fusion Network for Compressed Sensing MRI , 2018, ECCV.

[58]  Robi Polikar,et al.  Learn$^{++}$ .NC: Combining Ensemble of Classifiers With Dynamically Weighted Consult-and-Vote for Efficient Incremental Learning of New Classes , 2009, IEEE Transactions on Neural Networks.

[59]  Amanda C.C. Williams,et al.  Facial expression of pain: An evolutionary account , 2002, Behavioral and Brain Sciences.

[60]  Mohammad H. Mahoor,et al.  Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[61]  Yoshua Bengio,et al.  An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks , 2013, ICLR.

[62]  Alan S. Cowen,et al.  Self-report captures 27 distinct categories of emotion bridged by continuous gradients , 2017, Proceedings of the National Academy of Sciences.

[63]  Rui Zhang,et al.  DBSVEC: Density-Based Clustering Using Support Vector Expansion , 2019, 2019 IEEE 35th International Conference on Data Engineering (ICDE).

[64]  Guoying Zhao,et al.  Recognition of Affect in the Wild Using Deep Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[65]  Wei-Yi Chang,et al.  FATAUVA-Net: An Integrated Deep Learning Framework for Facial Attribute Recognition, Action Unit Detection, and Valence-Arousal Estimation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[66]  Junmo Kim,et al.  Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[67]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[68]  Aleix M. Martínez,et al.  EmotioNet: An Accurate, Real-Time Algorithm for the Automatic Annotation of a Million Facial Expressions in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[69]  Yue Gao,et al.  Real-Time Multimedia Social Event Detection in Microblog , 2018, IEEE Transactions on Cybernetics.

[70]  Stefan Rüping,et al.  Incremental Learning with Support Vector Machines , 2001, ICDM.

[71]  Takeo Kanade,et al.  Recognizing Action Units for Facial Expression Analysis , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[72]  Christoph H. Lampert,et al.  iCaRL: Incremental Classifier and Representation Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[73]  Svetlana Lazebnik,et al.  PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[74]  Cynthia Breazeal,et al.  Emotion and sociable humanoid robots , 2003, Int. J. Hum. Comput. Stud..

[75]  Gabriela Csurka,et al.  Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[76]  Sridha Sridharan,et al.  Automatically Detecting Pain in Video Through Facial Action Units , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[77]  L. F. Barrett Are Emotions Natural Kinds? , 2006, Perspectives on psychological science : a journal of the Association for Psychological Science.

[78]  Michael J. Black,et al.  Recognizing Facial Expressions in Image Sequences Using Local Parameterized Models of Image Motion , 1997, International Journal of Computer Vision.

[79]  Martial Hebert,et al.  Growing a Brain: Fine-Tuning by Increasing Model Capacity , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[80]  Abhishek Das,et al.  Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[81]  H. Schlosberg Three dimensions of emotion. , 1954, Psychological review.

[82]  Takeo Kanade,et al.  The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.