Self-disclosure topic model for Twitter conversations

Self-disclosure, the act of revealing oneself to others, is an important social behavior that contributes positively to intimacy and social support from others. It is a natural behavior, and social scientists have carried out numerous quantitative analyses of it through manual tagging and survey questionnaires. Recently, the flood of data from online social networks (OSNs) has offered a practical way to observe and analyze self-disclosure behavior at an unprecedented scale. The challenge with such analysis is that OSN data come with no annotations, and it would be impossible to manually annotate data at this scale for a quantitative analysis of self-disclosure. As a solution, we propose a semi-supervised machine learning approach that uses a variant of latent Dirichlet allocation to automatically classify self-disclosure in a massive dataset of Twitter conversations. To measure the accuracy of our model, we manually annotate a small subset of our dataset, and we show that our model achieves significantly higher accuracy and F-measure than various other methods. Using the results of our model, we uncover a positive and significant relationship between self-disclosure and online conversation frequency over time.
