A Language-independent and Compositional Model for Personality Trait Recognition from Short Texts

There have been many attempts to automatically recognise author personality traits from text, typically by combining linguistic features with conventional machine learning models such as linear regression or Support Vector Machines. In this work, we propose deep-learning-based models that start from the atomic features of text, the characters, and build hierarchical, vectorial word and sentence representations for the task of trait inference. On a corpus of tweets, this method shows state-of-the-art performance across five traits and three languages (English, Spanish and Italian) compared with prior work in author profiling. The results, supported by preliminary visualisation work, are encouraging for the automatic detection of complex human traits.
