Structure Analysis of Email Networks by Information-Theoretic Clustering

Many real-world systems, such as telecommunication networks, power grids, and email communication networks, can be represented as networks in which the nodes denote the objects of interest and the edges describe the relations between them. These complex networks have been shown to share many statistical properties, such as a scale-free nature and the small-world property. Modularity, or community structure, is another important characteristic of complex networks, and identifying modular structure can help us understand how a network functions. In this paper, we introduce a method based on information-theoretic clustering for finding communities (modules) in complex networks. The method is robust to the feature representation of the network. Moreover, unlike most existing algorithms, it does not require a search over the number of communities in a network and instead determines that number automatically. We apply the method to several well-studied networks, including a large-scale email communication network, and the computational results demonstrate its effectiveness.
