Beyond the Words: Predicting User Personality from Heterogeneous Information

An incisive understanding of user personality is not only essential to many scientific disciplines but also has a profound business impact on practical applications such as digital marketing, personalized recommendation, mental health diagnosis, and human resources management. Previous studies have demonstrated that language usage on social media is effective for personality prediction. However, a less researched direction is how to go beyond language features alone and leverage the heterogeneous information on social media to better understand user personality. In this paper, we propose a Heterogeneous Information Ensemble framework, called HIE, that predicts users' personality traits by integrating heterogeneous information, including self-language usage, avatars, emoticons, and responsive patterns. To improve prediction performance, we design different strategies for extracting semantic representations so that the framework fully leverages the heterogeneous information on social media. We evaluate our methods with extensive experiments on a real-world dataset covering both personality survey results and social media usage from thousands of volunteers. The results show that our approaches significantly outperform several widely adopted state-of-the-art baseline methods. To assess the utility of HIE in a real-world interactive setting, we also present DiPsy, a personalized chatbot that predicts user personality from heterogeneous information in digital traces and conversation logs.
