1 Machine Learning Methods for Social Signal Processing

In this chapter we systematize, analyze, and discuss recent trends in machine learning methods for social signal processing (SSP) (Pentland 2007). Because social signaling is often central to the subconscious decision making that affects everyday tasks (e.g., decisions about risks and rewards, resource utilization, or interpersonal relationships), automated understanding of social signals by computers is a task of paramount importance. Machine learning has played a prominent role in the advancement of SSP over the past decade. This is due, in part, to the exponential increase in data availability, which served as a catalyst for the adoption of a new data-driven direction in affective computing. Because the latent and complex physical processes that underpin social signals are difficult to model exactly, data has long served as a means to circumvent or supplement expert- or physics-based models, such as deformable musculoskeletal models of the human body, face, or hands and their movement, neuro-dynamical models of cognitive perception, or models of human vocal production. This trend parallels the role and success of machine learning in related areas, such as computer vision, cf. (Poppe 2010, Wright et al. 2010, Grauman & Leibe 2011), or audio, speech, and language processing, cf. (Deng & Li 2013), which serve as the core tools for analytic SSP tasks. Rather than aim for exhaustive coverage of the many approaches to data-driven SSP, which can be found in excellent surveys (Vinciarelli et al. 2009, Vinciarelli et al. 2012), we seek to present the methods in the context of current modeling challenges. In particular, we identify and discuss two major modeling directions:

[1]  M. Yachida,et al.  Facial expression recognition and its degree estimation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Shaogang Gong,et al.  Facial expression recognition based on Local Binary Patterns: A comprehensive study , 2009, Image Vis. Comput..

[3]  Maja Pantic,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING , 2022 .

[4]  Maja Pantic,et al.  Dynamics of facial expression: recognition of facial actions and their temporal segments from face profile image sequences , 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[5]  Takashi Okada,et al.  Covariance and PCA for Categorical Variables , 2005, PAKDD.

[6]  Qiang Ji,et al.  Active and dynamic information fusion for facial expression understanding from image sequences , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Peng Dai,et al.  Artificial Intelligence for Artificial Artificial Intelligence , 2011, AAAI.

[8]  R. Gur,et al.  Automated Facial Action Coding System for dynamic analysis of facial expressions in neuropsychiatric disorders , 2011, Journal of Neuroscience Methods.

[9]  Michael I. Jordan,et al.  A Probabilistic Interpretation of Canonical Correlation Analysis , 2005 .

[10]  Arman Savran,et al.  Regression-based intensity estimation of facial action units , 2012, Image Vis. Comput..

[11]  John McDonald,et al.  Automatic estimation of the dynamics of facial expression using a three-level model of intensity , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[12]  Shaogang Gong,et al.  Dynamic Facial Expression Recognition Using A Bayesian Temporal Manifold Model , 2006, BMVC.

[13]  Jennifer G. Dy,et al.  Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario , 2010, UAI.

[14]  Qiang Ji,et al.  Facial Action Unit Recognition by Exploiting Their Dynamic and Semantic Relationships , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Daniel S. Messinger,et al.  A framework for automated measurement of the intensity of non-posed Facial Action Units , 2009, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[16]  Michael J. Black,et al.  Recognizing Facial Expressions in Image Sequences Using Local Parameterized Models of Image Motion , 1997, International Journal of Computer Vision.

[17]  Emile A. Hendriks,et al.  Action unit classification using active appearance models and conditional random fields , 2011, Cognitive Processing.

[18]  Sridha Sridharan,et al.  In the Pursuit of Effective Affective Computing: The Relationship Between Features and Registration , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[19]  Vladimir Pavlovic,et al.  Automatic Pain Intensity Estimation with Heteroscedastic Conditional Ordinal Random Fields , 2013, ISVC.

[20]  Wei Chu,et al.  Gaussian Processes for Ordinal Regression , 2005, J. Mach. Learn. Res..

[21]  Roddy Cowie,et al.  Beyond emotion archetypes: Databases for emotion modelling using neural networks , 2005, Neural Networks.

[22]  Caifeng Shan.  Inferring facial and body language , 2007 .

[23]  J. Beyene,et al.  Potential risk factors associated with human encephalitis: application of canonical correlation analysis , 2011, BMC medical research methodology.

[24]  Maja Pantic,et al.  Fully Automatic Recognition of the Temporal Phases of Facial Actions , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[25]  Vladimir Pavlovic,et al.  Dynamic Probabilistic CCA for Analysis of Affective Behaviour , 2012, ECCV.

[26]  Dirk Heylen,et al.  Bridging the Gap between Social Animal and Unsocial Machine: A Survey of Social Signal Processing , 2012, IEEE Transactions on Affective Computing.

[27]  Gwen Littlewort,et al.  Fully Automatic Facial Action Recognition in Spontaneous Behavior , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[28]  L. Tucker.  An inter-battery method of factor analysis , 1958 .

[29]  Roddy Cowie,et al.  FEELTRACE: an instrument for recording perceived emotion in real time , 2000 .

[30]  Jiawei Han,et al.  Spectral Regression for Efficient Regularized Subspace Learning , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[31]  Fernando De la Torre,et al.  Selective Transfer Machine for Personalized Facial Action Unit Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  P. Ekman.  Pictures of Facial Affect , 1976 .

[33]  Maja Pantic,et al.  A Dynamic Texture-Based Approach to Recognition of Facial Actions and Their Temporal Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Maja Pantic,et al.  Detecting facial actions and their temporal segments in nearly frontal-view face image sequences , 2005, 2005 IEEE International Conference on Systems, Man and Cybernetics.

[35]  Roddy Cowie,et al.  Emotional speech: Towards a new generation of databases , 2003, Speech Commun..

[36]  Benjamin B. Bederson,et al.  Human computation: a survey and taxonomy of a growing field , 2011, CHI.

[37]  John McDonald,et al.  Investigating the Dynamics of Facial Expression , 2006, ISVC.

[38]  Hatice Gunes,et al.  Continuous Prediction of Spontaneous Affect from Multiple Cues and Modalities in Valence-Arousal Space , 2011, IEEE Transactions on Affective Computing.

[39]  Jan de Leeuw,et al.  Principal component analysis of binary data by iterated singular value decomposition , 2006, Comput. Stat. Data Anal..

[40]  Maja Pantic,et al.  Continuous Pain Intensity Estimation from Facial Expressions , 2012, ISVC.

[41]  Shaogang Gong,et al.  Appearance Manifold of Facial Expression , 2005, ICCV-HCI.

[42]  Qingshan Liu,et al.  Boosting encoded dynamic features for facial expression recognition , 2009, Pattern Recognit. Lett..

[43]  Beat Fasel,et al.  Recognition of asymmetric facial action unit activities and intensities , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[44]  Jun Ohya,et al.  Recognizing multiple persons' facial expressions using HMM based on automatic extraction of significant frames from image sequences , 1997, Proceedings of International Conference on Image Processing.

[45]  Vatcharaporn Esichaikul,et al.  Fuzzy-C-Mean Determines the Principle Component Pairs to Estimate the Degree of Emotion from Facial Expressions , 2005, FSKD.

[46]  Vladimir Pavlovic,et al.  Context-Sensitive Conditional Ordinal Random Fields for Facial Action Intensity Estimation , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[47]  M. Tarr,et al.  Visual Object Recognition , 1996, ISTCS.

[48]  J. Russell,et al.  The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology , 2005, Development and Psychopathology.

[49]  Bogdan Gabrys,et al.  Classifier selection for majority voting , 2005, Inf. Fusion.

[50]  Hatice Gunes,et al.  Automatic Temporal Segment Detection and Affect Recognition From Face and Body Display , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[51]  Björn W. Schuller,et al.  Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies , 2008, INTERSPEECH.

[52]  Ying-li Tian,et al.  Evaluation of Face Resolution for Expression Analysis , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[53]  Guillermo Sapiro,et al.  Sparse Representation for Computer Vision and Pattern Recognition , 2010, Proceedings of the IEEE.

[54]  Jeffrey F. Cohn,et al.  Painful data: The UNBC-McMaster shoulder pain expression archive database , 2011, Face and Gesture 2011.

[55]  Hatice Gunes,et al.  Automatic Segmentation of Spontaneous Data using Dimensional Labels from Multiple Coders , 2010 .

[56]  Trevor Darrell,et al.  Hidden Conditional Random Fields for Gesture Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[57]  Rogério Schmidt Feris,et al.  Manifold Based Analysis of Facial Expression , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[58]  A. Tannenbaum,et al.  Agitation and pain assessment using digital imaging , 2009, 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society.

[59]  Lifeng Shang,et al.  Nonparametric discriminant HMM and application to facial expression recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Vladimir Pavlovic,et al.  Kernel Conditional Ordinal Random Fields for Temporal Segmentation of Facial Action Units , 2012, ECCV Workshops.

[61]  Ronald Poppe,et al.  A survey on vision-based human action recognition , 2010, Image Vis. Comput..

[62]  Maja Pantic,et al.  Social signal processing: Survey of an emerging domain , 2009, Image Vis. Comput..

[63]  Xiao Li,et al.  Machine Learning Paradigms for Speech Recognition: An Overview , 2013, IEEE Transactions on Audio, Speech, and Language Processing.

[64]  Hatice Gunes,et al.  From the Lab to the real world: affect recognition using multiple cues and modalities , 2008 .

[65]  Athanasios Katsamanis,et al.  Tracking changes in continuous emotion states using body language and prosodic cues , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[66]  S. Lai,et al.  Learning partially-observed hidden conditional random fields for facial expression recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[67]  Mohammad T. Manzuri Shalmani,et al.  Recognizing Combinations of Facial Action Units with Different Intensity Using a Mixture of Hidden Markov Models and Neural Network , 2010, MCS.

[68]  Fernando De la Torre,et al.  Continuous AU intensity estimation using localized, sparse facial feature space , 2013, 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[69]  Vladimir Pavlovic,et al.  Structured Output Ordinal Regression for Dynamic Facial Emotion Intensity Prediction , 2010, ECCV.

[70]  M. V. Lamar,et al.  Recognizing facial actions using Gabor wavelets with neutral face average difference , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[71]  Maja Pantic,et al.  Facial action recognition for facial expression analysis from static face images , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[72]  Ching Y. Suen,et al.  Application of majority voting to pattern recognition: an analysis of its behavior and performance , 1997, IEEE Trans. Syst. Man Cybern. Part A.

[73]  Qingshan Liu,et al.  RankBoost with l1 regularization for facial expression recognition and intensity estimation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[74]  Fernando De la Torre,et al.  Action unit detection with segment-based SVMs , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[75]  Shrikanth Narayanan,et al.  The USC Creative IT Database: A Multimodal Database of Theatrical Improvisation , 2010 .

[76]  Gwen Littlewort,et al.  Recognizing facial expression: machine learning and application to spontaneous behavior , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[77]  Gerardo Hermosillo,et al.  Learning From Crowds , 2010, J. Mach. Learn. Res..

[78]  Haiping Lu,et al.  A survey of multilinear subspace learning for tensor data , 2011, Pattern Recognit..

[79]  Jake K. Aggarwal,et al.  Facial expression recognition with temporal modeling of shapes , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[80]  Peng Dai,et al.  Decision-Theoretic Control of Crowd-Sourced Workflows , 2010, AAAI.

[81]  Wei Chu,et al.  New approaches to support vector ordinal regression , 2005, ICML.

[82]  Vladimir Pavlovic,et al.  Multi-output Laplacian dynamic ordinal regression for facial expression recognition and intensity estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[83]  Carlos Busso,et al.  Analysis and Compensation of the Reaction Lag of Evaluators in Continuous Emotional Annotations , 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[84]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[85]  Mohammad H. Mahoor,et al.  DISFA: A Spontaneous Facial Action Intensity Database , 2013, IEEE Transactions on Affective Computing.

[86]  Gerardo Hermosillo,et al.  Supervised learning from multiple experts: whom to trust when everyone lies a bit , 2009, ICML '09.

[87]  Ahmed M. Elgammal,et al.  Facial Expression Analysis Using Nonlinear Decomposable Generative Models , 2005, AMFG.

[88]  Garrison W. Cottrell,et al.  Representing Face Images for Emotion Classification , 1996, NIPS.

[89]  Lior Rokach,et al.  Data Mining And Knowledge Discovery Handbook , 2005 .

[90]  Samuel Kaski,et al.  Probabilistic approach to detecting dependencies between data sets , 2008, Neurocomputing.