Modeling and predicting personal information dissemination behavior

In this paper, we propose a new way to automatically model and predict how people receive and disseminate information by analyzing the contacts and content of their personal communications. A personal profile, called CommunityNet, is established for each individual by a novel algorithm that incorporates contact, content, and time information simultaneously. It can be used for personal social capital management. Clusters of CommunityNets provide a view of informal networks for organization management. Our algorithm combines dynamic algorithms from the social network field with semantic content classification methods from the natural language processing and machine learning literature. We tested CommunityNets on the Enron Email corpus and report experimental results covering filtering, prediction, and recommendation capabilities. We show that personal behavior and intention are somewhat predictable based on these models. For instance, "to whom a person is going to send a specific email" can be predicted from that person's personal social network and content analysis. Experimental results show that the prediction accuracy of the proposed adaptive algorithm is 58% better than that of social network-based predictions, and 75% better than that of an aggregated model based on Latent Dirichlet Allocation with social network enhancement. We also discuss two online demo systems we developed that allow interactive exploration of CommunityNet.
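To make the recipient-prediction task concrete, the following is a minimal sketch of ranking candidate recipients by mixing a contact signal with a content signal. It is not the CommunityNet algorithm: the function `rank_recipients`, the weight `alpha`, the bag-of-words cosine similarity, and the toy `history` records are all assumptions chosen for illustration, and the time information that the abstract says the actual method uses is left out.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately crude assumption)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_recipients(history, sender, new_text, alpha=0.5):
    """Rank the sender's past recipients for a new message.

    history: iterable of (sender, recipients, text) tuples.
    alpha:   hypothetical mixing weight; 1.0 uses contact frequency only,
             0.0 uses content similarity only.
    """
    contact_counts = Counter()              # how often sender wrote to each person
    content_profiles = defaultdict(Counter)  # words sender used with each person
    for past_sender, recipients, text in history:
        if past_sender != sender:
            continue
        for recipient in recipients:
            contact_counts[recipient] += 1
            content_profiles[recipient].update(tokenize(text))

    total = sum(contact_counts.values())
    query = Counter(tokenize(new_text))
    scores = {}
    for recipient, count in contact_counts.items():
        contact_score = count / total
        content_score = cosine(query, content_profiles[recipient])
        scores[recipient] = alpha * contact_score + (1 - alpha) * content_score
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for this sketch (not drawn from the Enron corpus).
history = [
    ("alice", ["bob"], "gas trading contract pricing"),
    ("alice", ["bob"], "trading desk schedule"),
    ("alice", ["carol"], "legal review of the merger agreement"),
    ("dave", ["carol"], "lunch"),
]

print(rank_recipients(history, "alice", "merger agreement draft"))
print(rank_recipients(history, "alice", "merger agreement draft", alpha=1.0))
```

With `alpha=1.0` the ranking depends only on how often the sender has written to each person, which loosely mirrors a purely social network-based baseline; lowering `alpha` lets the message content shift the ranking toward recipients who have received similar text before.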
