Facial Expression Recognition in the Wild Using Multi-Level Features and Attention Mechanisms

Learning discriminative features is vital for automatic Facial Expression Recognition (FER) in the wild. In this paper, we propose a novel Slide-Patch and Whole-Face Attention model with SE blocks (SPWFA-SE), which jointly perceives discriminative local characteristics and informative global features of the face for effective FER. Specifically, slide patches are designed to extract local features. Unlike existing methods, our slide patches preserve the information at the edge areas of patches and do not require facial landmark detection. Moreover, to make the model adaptively focus on distinguishable regions, an attention module is introduced at the patch level to learn the weight of each patch. Furthermore, squeeze-and-excitation (SE) blocks are employed at the channel level to learn the weight of each channel. As such, the proposed multi-level feature extraction and attention mechanisms enhance the representational ability of the learned features. Extensive experiments on five challenging datasets demonstrate that our method achieves state-of-the-art performance. Cross-database experiments on another three databases show the superior generalization performance of our model. Furthermore, complexity analysis shows that our model has fewer parameters and trains faster than competing models.
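
To make the three ingredients named above concrete, the following is a minimal NumPy sketch of overlapping slide patches, patch-level attention, and channel-level squeeze-and-excitation. It is an illustration under stated assumptions, not the SPWFA-SE implementation: the function names, the patch size and stride, the softmax form of the patch attention, the random weights, and the concatenation used to fuse local and global features are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def slide_patches(x, size, stride):
    """Cut overlapping square patches from an (H, W, C) array.

    With stride < size, neighbouring patches overlap, so content near the
    edge of one patch also lies inside an adjacent patch. No facial
    landmarks are needed: patch positions depend only on size and stride.
    """
    h, w, _ = x.shape
    return np.stack([x[i:i + size, j:j + size]
                     for i in range(0, h - size + 1, stride)
                     for j in range(0, w - size + 1, stride)])

def se_block(x, w1, w2):
    """Squeeze-and-excitation: rescale each channel of an (H, W, C) array."""
    squeezed = x.mean(axis=(0, 1))                # squeeze: global average pool -> (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)       # excitation, step 1: linear + ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # excitation, step 2: linear + sigmoid
    return x * gates                              # per-channel weights in (0, 1)

def patch_attention(patch_feats, v):
    """Weight (N, D) patch features by softmax scores (assumed attention form)."""
    scores = patch_feats @ v
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights, (weights[:, None] * patch_feats).sum(axis=0)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8, 16))
w1 = rng.standard_normal((16, 4)) * 0.1
w2 = rng.standard_normal((4, 16)) * 0.1
v = rng.standard_normal(16) * 0.1

# Local branch: 4x4 patches with stride 2 give a 3x3 grid of 9 patches.
patches = slide_patches(feature_map, size=4, stride=2)
patch_feats = np.stack([se_block(p, w1, w2).mean(axis=(0, 1)) for p in patches])
weights, local_feat = patch_attention(patch_feats, v)

# Global branch: the whole feature map, reweighted per channel.
global_feat = se_block(feature_map, w1, w2).mean(axis=(0, 1))

# Assumed fusion: concatenate the local and global descriptors.
fused = np.concatenate([local_feat, global_feat])
```

In this sketch the stride is half the patch size, so every interior location is covered by more than one patch; this is one simple way to keep edge information that non-overlapping crops would split across patch boundaries.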