A Global Alignment Kernel based Approach for Group-level Happiness Intensity Estimation

With progress in automatic human behavior understanding, analysing the perceived affect of multiple people has received growing interest in the affective computing community. Unlike conventional facial expression analysis, this paper focuses on analysing the behaviour of multiple people in an image. The proposed method uses support vector regression with combined global alignment kernels (GAKs) to estimate the happiness intensity of a group of people. We first exploit Riesz-based volume local binary pattern (RVLBP) and deep convolutional neural network (CNN) based features to characterize facial images. We then propose a separate GAK for the RVLBP features and for the deep CNN features, each explicitly measuring the similarity of two group-level images. Specifically, we use a global weight sort scheme to order the face images from a group-level image according to their spatial weights, which gives the GAK an efficient, sequence-like data structure to operate on. Lastly, we propose multiple kernel learning based on three combination strategies to combine the two GAKs built on RVLBP and deep CNN features, thereby enhancing the discriminative ability of each GAK. Extensive experiments are performed on the challenging group-level happiness intensity database HAPPEI. Our experimental results demonstrate that the proposed approach achieves promising performance for group happiness intensity analysis when compared with recent state-of-the-art methods.
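The pipeline described above (sort faces by weight, compare two groups with a GAK, combine two kernels) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names `sort_faces_by_weight`, `gak`, and `combine_kernels`, the bandwidth `sigma`, the mixing weight `eta`, the Gaussian-derived local kernel `g / (2 - g)`, and the convex weighted sum, which stands in for just one possible combination strategy (the abstract does not say which three strategies are used).

```python
import numpy as np


def sort_faces_by_weight(features, weights):
    """Order per-face feature vectors by descending weight (hypothetical helper)."""
    order = np.argsort(-np.asarray(weights, dtype=float), kind="stable")
    return np.asarray(features, dtype=float)[order]


def gak(x, y, sigma=1.0):
    """Global alignment kernel between two sequences of feature vectors.

    Sums the products of local kernel values over all monotone alignments,
    computed with the standard dynamic-programming recursion.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)

    # Pairwise squared Euclidean distances between sequence elements.
    sq = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    g = np.exp(-sq / (2.0 * sigma ** 2))
    local = g / (2.0 - g)  # assumed local kernel, values in (0, 1]

    # M[i, j] accumulates all alignments of x[:i] with y[:j].
    M = np.zeros((n + 1, m + 1))
    M[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i, j] = local[i - 1, j - 1] * (
                M[i - 1, j] + M[i, j - 1] + M[i - 1, j - 1]
            )
    return M[n, m]


def combine_kernels(k_a, k_b, eta=0.5):
    """Convex combination of two kernel values or Gram matrices (one assumed strategy)."""
    return eta * np.asarray(k_a) + (1.0 - eta) * np.asarray(k_b)


# Toy usage: two "groups" with different numbers of faces and 2-D features.
group_a = sort_faces_by_weight([[0.1, 0.2], [0.4, 0.1], [0.3, 0.3]], [0.2, 0.9, 0.5])
group_b = sort_faces_by_weight([[0.2, 0.2], [0.5, 0.0]], [0.7, 0.3])
similarity = gak(group_a, group_b, sigma=0.5)
```

Because the recursion sums over alignments, the two groups do not need the same number of faces. For long sequences the products can underflow, so a practical implementation would likely work in the log domain; this sketch omits that for brevity. A Gram matrix built from such values could be passed to a regressor that accepts precomputed kernels.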
