A time decoupling approach for studying forum dynamics

Online forums are rich sources of information about user communication activity over time. Finding temporal patterns in forum communication threads can advance our understanding of the dynamics of conversations. The main challenge of temporal analysis in this context is the complexity of forum data: there can be thousands of interacting users, each of whom can be described numerically in many different ways, and user characteristics can evolve over time. We propose an approach that decouples temporal information about users into sequences of user events and inter-event times. We develop a new feature space to represent the event sequences as paths, and we model the distribution of the inter-event times. We study over 30,000 users across four Internet forums and discover novel patterns in user communication. We find that users tend to exhibit consistency over time. Furthermore, in our feature space, we observe regions that represent unlikely user behaviors. Finally, we show how to derive a numerical representation for each forum, and we use this representation to cluster multiple forums in a novel way.
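To make the decoupling step concrete, the following is a minimal sketch of how one user's timestamped activity could be split into an event sequence and a list of inter-event times. It is an illustration only, not the implementation used in this work: the function name `decouple`, the event labels "post" and "reply", and the use of seconds as the time unit are all assumptions introduced here.

```python
from typing import List, Tuple


def decouple(events: List[Tuple[float, str]]) -> Tuple[List[str], List[float]]:
    """Split one user's timestamped events into two separate views.

    `events` holds (timestamp, label) pairs; labels such as "post" and
    "reply" are hypothetical placeholders, not categories from the paper.
    Returns the labels in chronological order and the gaps between
    consecutive timestamps.
    """
    # Order events chronologically so both outputs share one timeline.
    ordered = sorted(events, key=lambda e: e[0])

    # View 1: the event sequence, with all timing information removed.
    labels = [label for _, label in ordered]

    # View 2: the inter-event times, with all event identity removed.
    times = [t for t, _ in ordered]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return labels, gaps


# Hypothetical activity log for one user (timestamps in seconds).
user_log = [(0.0, "post"), (120.0, "reply"), (90.0, "reply")]
sequence, inter_event_times = decouple(user_log)
```

Under these assumptions, `sequence` could then be mapped to a path in a feature space, while `inter_event_times` could be pooled across users to fit a distribution; both of those later steps are outside the scope of this sketch.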
