Multimodal Self-Assessed Personality Estimation During Crowded Mingle Scenarios Using Wearable Devices and Cameras

This paper focuses on the automatic classification of self-assessed personality traits from the HEXACO inventory during crowded mingle scenarios. These scenarios provide rich case studies for social behavior analysis, but they are also challenging to analyze automatically, as people interact dynamically and freely in an in-the-wild, face-to-face setting. To address this, we leverage wearable sensors recording acceleration and proximity, together with video from overhead cameras. We use three behavioral modality types (movement, speech, and proximity) derived from two sensor types (wearables and cameras). Unlike prior work, we extract an individual's speaking status from a single body-worn triaxial accelerometer instead of audio, which scales easily to large populations. Additionally, we study the effect of different combinations of modality types on personality estimation, and how this relates to the nature of each trait. We also analyze feature complementarity and evaluate feature importance for the classification, showing that combining complementary modality types further improves classification performance. We estimate the self-assessed personality traits both as a binary classification (the community standard) and as a regression over the trait scores. Finally, we analyze the impact of speech detection accuracy on the overall performance of the personality estimation.