Profiling analysis of DISC personality traits based on Twitter posts in Bahasa Indonesia

Abstract The timelines of social media users contain text, images, audio, and video. This large, unstructured data can be processed with techniques such as text processing or image processing. In this study, processed text data is used to classify Twitter users’ personalities according to the DISC framework. From an initial collection of 292 users, we semi-automatically filtered for personal accounts that post in Indonesian. To observe and assess a user’s personality from the word choices in their tweets, we built keyword vocabularies corresponding to the DISC framework and theory. The study runs four experimental scenarios, varying whether the keywords and text data are stemmed and whether the keyword frequency calculation is weighted. Weighting the keyword counts according to each keyword’s level does not improve results, and neither does stemming: the best results come from the unstemmed, unweighted scenario. This study is preliminary research toward an automatic profiling system that combines Natural Language Processing and Machine Learning approaches.
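The four scenarios (stemmed or not, weighted or not) can be illustrated with a minimal Python sketch of keyword-frequency scoring. Everything specific here is an assumption made for illustration: the keyword lists, the level weights, the `toy_stem` suffix stripper, and the function names are hypothetical and are not the vocabularies, stemmer, or scoring formula used in the study.

```python
import re
from collections import Counter

# Hypothetical vocabularies: DISC trait -> {keyword: level weight}.
# These words and weights are illustrative only, not the study's lists.
DISC_KEYWORDS = {
    "D": {"menang": 2, "berani": 1},
    "I": {"senang": 2, "teman": 1},
    "S": {"sabar": 2, "tenang": 1},
    "C": {"teliti": 2, "aturan": 1},
}


def toy_stem(word):
    """Placeholder stemmer (assumption): strips a few common suffixes.

    A real system would use a proper Indonesian stemmer instead.
    """
    for suffix in ("nya", "kan", "lah"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def score_user(tweets, keywords=DISC_KEYWORDS, stem=False, weighted=False):
    """Count keyword occurrences per DISC trait across a user's tweets."""
    tokens = [t for tweet in tweets for t in re.findall(r"[a-z]+", tweet.lower())]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    counts = Counter(tokens)

    scores = {}
    for trait, vocab in keywords.items():
        total = 0
        for keyword, level in vocab.items():
            # Stem the keyword too, so it matches the stemmed tokens.
            key = toy_stem(keyword) if stem else keyword
            total += counts[key] * (level if weighted else 1)
        scores[trait] = total
    return scores


def predict_trait(scores):
    """Return the trait with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)
```

Calling `score_user` with each combination of `stem` and `weighted` yields the four scenarios for one user; how the study turned such counts into a final personality label is not stated in the abstract, so `predict_trait` is only one plausible choice.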
