Blog tells what kind of personality you have: egogram estimation from Japanese weblog

In this paper, we investigate personality estimation from Japanese weblog text. Among the various personality types, we focus on the Egogram, which is used in Transactional Analysis and is strongly related to the communicative behavior of individuals. Estimation is performed with a Multinomial Naïve Bayes classifier over feature words selected by information gain. We evaluated the validity of this approach on real weblog text from 551 subjects. The results show that our approach achieves a 12-25% improvement over the baseline. The feature words selected for the estimation are strongly correlated with the characteristics of the Egogram.
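The pipeline described above (information-gain feature selection followed by Multinomial Naïve Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English tokens, the "high"/"low" labels for a single ego state, the function names, and the add-one smoothing are all assumptions made for illustration. Real Japanese weblog text would first need word segmentation, which is omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Entropy reduction from splitting documents on presence of `word`."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k words with the highest information gain.

    Ties are broken alphabetically because the vocabulary is sorted
    first and Python's sort is stable.
    """
    vocab = sorted({w for d in docs for w in d})
    return sorted(vocab,
                  key=lambda w: information_gain(docs, labels, w),
                  reverse=True)[:k]


def train_mnb(docs, labels, features):
    """Fit Multinomial Naive Bayes with add-one smoothing (an assumption)."""
    feats = set(features)
    classes = sorted(set(labels))
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, y in zip(docs, labels):
        counts[y].update(w for w in doc if w in feats)
    log_lik = {}
    for c in classes:
        denom = sum(counts[c].values()) + len(feats)
        log_lik[c] = {w: math.log((counts[c][w] + 1) / denom) for w in feats}
    return log_prior, log_lik


def predict(model, doc):
    """Return the class with the highest posterior log-score."""
    log_prior, log_lik = model
    scores = {c: log_prior[c] + sum(log_lik[c][w] for w in doc
                                    if w in log_lik[c])
              for c in log_prior}
    return max(scores, key=scores.get)


# Hypothetical toy corpus: tokenized posts labeled with a high/low level
# for one ego state. These are not data from the paper.
docs = [["must", "rule", "should"],
        ["must", "duty", "today"],
        ["fun", "party", "today"],
        ["fun", "relax", "wow"]]
labels = ["high", "high", "low", "low"]

features = select_features(docs, labels, k=2)
model = train_mnb(docs, labels, features)
print(features)
print(predict(model, ["must", "today"]))
```

In this toy corpus, "today" appears equally in both classes, so its information gain is zero and it is not selected. Words confined to one class rank highest, which is the property the abstract relies on when relating the selected words to Egogram characteristics.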
