Entity Role Discovery in Hierarchical Topical Communities

People and social communities are often characterized by the topics and themes they work on or communicate about. Discovering the roles played by different entities in these communities is of great interest in many real-world social network analysis contexts. We are also often interested in discovering such roles at different levels of granularity. In this paper we study a new problem of mining entity roles in hierarchical topical communities. We first detect topical communities from the text component of a social or information network. Because we mine phrases from the network and represent topical communities by ranked lists of mixed-length phrases, the communities are readily interpretable at multiple levels of the hierarchy. We are therefore able to discover topical roles of different types of entities both in large communities encompassing more general topics and in small, focused subcommunities. We demonstrate our method on a bibliographic information network dataset, which we use to discover the roles of authors and publication venues in the context of the hierarchical topical communities.
