A Text Cube Approach to Human, Social and Cultural Behavior in the Twitter Stream

Twitter is a microblogging website that has proven useful as a data source for analyzing human social behavior, including political sentiment, user influence, and the spread of news. In this paper, we discuss a text cube approach to studying different kinds of human, social and cultural behavior (HSCB) embedded in the Twitter stream. A text cube is a new way to organize data (e.g., Twitter text) along multiple dimensions and multiple hierarchies for efficient information query and visualization. With HSCB measures defined in a cube, users can view statistical reports and perform online analytical processing (OLAP). In addition to viewing and analyzing Twitter text using cubes and charts, we have added the capability to display the contents of a cube on a heat map, in which the degree of opacity is directly proportional to the value of the behavioral, social or cultural measure. This kind of map allows the analyst to focus attention on hotspots of concern in a region of interest. The text cube architecture also supports the development of data mining models using data taken from cubes. We provide several case studies to illustrate the text cube approach, including public sentiment in a U.S. city and political sentiment in the Arab Spring.
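
To make the cube idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes two dimensions (location and time), each with a two-level hierarchy, and a single HSCB measure taken to be the mean sentiment score per cell. The sample tweets, the level names, and the functions `aggregate` and `opacity` are all hypothetical. The opacity mapping normalizes the absolute value of the measure by the largest cell, which is one possible reading of "directly proportional".

```python
from collections import defaultdict

# Hypothetical records: (state, city, date "YYYY-MM-DD", sentiment in [-1, 1]).
TWEETS = [
    ("NY", "Buffalo", "2011-03-01", 0.5),
    ("NY", "Buffalo", "2011-03-02", -0.25),
    ("NY", "Albany", "2011-03-01", 0.75),
    ("PA", "Erie", "2011-04-10", -0.5),
]

# Assumed hierarchies: location is state > city, time is month > day.
# Each level maps a record to the key of the cell it falls into.
LEVELS = {
    "state": lambda t: t[0],
    "city": lambda t: (t[0], t[1]),
    "month": lambda t: t[2][:7],
    "day": lambda t: t[2],
}


def aggregate(tweets, location_level, time_level):
    """Return {(location_key, time_key): mean sentiment} at the chosen levels."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running sum, count]
    for t in tweets:
        key = (LEVELS[location_level](t), LEVELS[time_level](t))
        sums[key][0] += t[3]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def opacity(cells):
    """Map each cell to an opacity in [0, 1], proportional to |measure|."""
    peak = max(abs(v) for v in cells.values()) or 1.0  # avoid dividing by zero
    return {key: abs(v) / peak for key, v in cells.items()}


if __name__ == "__main__":
    coarse = aggregate(TWEETS, "state", "month")  # rolled-up view
    fine = aggregate(TWEETS, "city", "day")       # drilled-down view
    print(coarse)
    print(fine)
    print(opacity(coarse))
```

Calling `aggregate` with coarser levels corresponds to an OLAP roll-up and with finer levels to a drill-down. A production cube would typically precompute or cache these aggregates rather than rescan the records on every query.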
