Leveraging the Deep Learning Paradigm for Continuous Affect Estimation from Facial Expressions

Continuous affect estimation from facial expressions has attracted increasing attention in the affective computing research community. This paper presents a principled framework for estimating continuous affect from video sequences. We address the problem of continuous affect estimation by leveraging the Bayesian filtering paradigm, i.e., considering affect as a latent dynamical system corresponding to a general feeling of pleasure with a degree of arousal, and recursively estimating its state from a sequence of visual observations. To this end, we advance the state of the art as follows: (i) Canonical face representation (CFR), a novel algorithm for two-dimensional face frontalization; (ii) Convex unsupervised representation learning (CURL), a novel frequency-domain convex optimization algorithm for the unsupervised training of deep convolutional neural networks (CNNs); and (iii) Deep extended Kalman filtering (DEKF), an extended Kalman filtering-based algorithm for affect estimation from a sequence of deep CNN observations. The performance of the resulting CFR-CURL-DEKF algorithmic framework is empirically evaluated on publicly available benchmark datasets for facial expression recognition (CK+) and continuous affect estimation (AVEC 2012 and AVEC 2014).
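To make the Bayesian filtering formulation concrete, the following is a minimal, illustrative extended Kalman filter sketch for a two-dimensional latent affect state (pleasure and arousal). The dynamics matrix, noise covariances, and `tanh` observation model are hypothetical placeholders standing in for the paper's learned DEKF components, not the actual method.

```python
import numpy as np

# Hypothetical 2-D affect state: (pleasure, arousal).
# All model parameters below are illustrative assumptions, not the paper's DEKF.
A = np.eye(2) * 0.95   # state transition: slow decay toward a neutral state
Q = np.eye(2) * 0.01   # process noise covariance
R = np.eye(2) * 0.10   # observation noise covariance

def h(x):
    # Nonlinear observation model: squash the latent state into [-1, 1]
    return np.tanh(x)

def H_jac(x):
    # Jacobian of h evaluated at x (needed by the EKF linearization)
    return np.diag(1.0 - np.tanh(x) ** 2)

def ekf_step(x, P, z):
    """One EKF predict/update cycle given observation z."""
    # Predict: propagate the state estimate and its covariance
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: linearize h at the predicted state and correct with z
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new
```

In the paper's setting, the observation at each frame would come from a deep CNN applied to the frontalized face, and `ekf_step` would be iterated over the video sequence to produce the continuous affect trajectory.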