Group Profiling for Understanding Social Structures

The prolific use of the participatory Web and social networking sites is reshaping the ways in which people interact with one another. Social media has become a vital part of human social life in both the developed and the developing world. People who share certain similarities or affiliations tend to form communities within social media. At the same time, they participate in various online activities: content sharing, tagging, posting status updates, etc. These diverse activities leave behind traces of their social life, providing clues to understanding changing social structures. A large body of existing work focuses on extracting cohesive groups based on network topology, but little attention has been paid to understanding those changing social structures. To help explain the formation of a group, we explore different group-profiling strategies for constructing descriptions of a group. This research can assist network navigation, visualization, and analysis, as well as the monitoring and tracking of the ebb and flow of different groups in evolving networks. Using information collected from real-world social media sites, we conduct extensive experiments to evaluate group-profiling results. We analyze the pros and cons of the different group-profiling strategies with concrete examples, show some potential applications based on group profiling, and report and discuss our findings.

[51]  Huan Liu, et al. Group Profiling for Understanding Social Structures, 2011.