A Deep Network for Joint Analysis of Apparent Personality, Emotion and Their Relationship

Apparent personality and emotion analysis are both central to affective computing, yet existing work addresses them separately. In this paper, we investigate whether such high-level affect traits and their relationship can be jointly learned from face images in the wild. To this end, we introduce PersEmoN, an end-to-end trainable, deep Siamese-like network. It consists of two convolutional network branches, one for emotion and the other for apparent personality. The two branches share their bottom feature extraction module and are optimized within a multi-task learning framework, with each branch trained on its own annotated dataset. Furthermore, an adversarial-like loss function is employed to promote representation coherence across the heterogeneous dataset sources. Building on this shared representation, we also explore the relationship between emotion and apparent personality. Extensive experiments demonstrate the effectiveness of PersEmoN.
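The structure described above (a shared feature module, two task branches, and an extra term that encourages dataset-agnostic features) can be illustrated with a toy sketch. This is not the PersEmoN implementation: the dense layers stand in for convolutional ones, and the output sizes (two emotion values, five personality traits), the domain-confusion form of the adversarial-like term, the weight `lam`, and every name in the code are assumptions made only for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def init(rows, cols, rng):
    """Random weights and zero biases for a rows x cols dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return weights, [0.0] * rows


class TwoBranchSketch:
    """Hypothetical stand-in: shared bottom module plus two task heads."""

    def __init__(self, in_dim=8, feat_dim=4, seed=0):
        rng = random.Random(seed)
        self.shared = init(feat_dim, in_dim, rng)        # shared by both branches
        self.emotion_head = init(2, feat_dim, rng)       # assumed: 2 emotion outputs
        self.personality_head = init(5, feat_dim, rng)   # assumed: 5 trait outputs
        self.domain_head = init(1, feat_dim, rng)        # guesses the source dataset

    def features(self, x):
        return relu(linear(*self.shared, x))

    def emotion(self, x):
        return linear(*self.emotion_head, self.features(x))

    def personality(self, x):
        return [sigmoid(a) for a in linear(*self.personality_head, self.features(x))]

    def domain_prob(self, x):
        return sigmoid(linear(*self.domain_head, self.features(x))[0])


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def confusion_loss(p):
    """Cross-entropy against a uniform domain label; smallest when p == 0.5."""
    eps = 1e-12
    return -0.5 * (math.log(p + eps) + math.log(1.0 - p + eps))


def joint_loss(model, emo_x, emo_y, per_x, per_y, lam=0.1):
    """Each sample supervises only its own branch; both feed the confusion term."""
    l_emo = mse(model.emotion(emo_x), emo_y)
    l_per = mse(model.personality(per_x), per_y)
    l_adv = 0.5 * (confusion_loss(model.domain_prob(emo_x))
                   + confusion_loss(model.domain_prob(per_x)))
    return l_emo + l_per + lam * l_adv


if __name__ == "__main__":
    rng = random.Random(1)
    model = TwoBranchSketch()
    emo_x = [rng.random() for _ in range(8)]   # sample from the emotion dataset
    per_x = [rng.random() for _ in range(8)]   # sample from the personality dataset
    print(joint_loss(model, emo_x, [0.2, -0.1], per_x, [0.5] * 5))
```

The point of the sketch is the data flow: an emotion-labelled sample and a personality-labelled sample pass through the same `features` function, each contributes a supervised loss only to its own head, and the confusion term is lowest when the shared features do not reveal which dataset a sample came from.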
