Evaluating the community partition quality of a network with a genetic programming approach

Although the problem of evaluating partition quality is well known in the literature, most traditional approaches build a model on a theoretical foundation and then apply it to real data. Conversely, this work presents a novel approach: it extracts a model from a network whose partition into ground-truth communities is known, so that the model can be reused in other contexts. The extracted model takes the form of a validation function, that is, a function that assigns a score to a specific partition of a network: the closer the partition is to the optimal one, the better the score. To obtain a suitable validation function, we use genetic programming, an application of genetic algorithms in which the individuals of a population are computer programs. In this paper we present a computationally feasible methodology for setting up the genetic programming run, and describe our design choices for the terminal set, function set, fitness function and control parameters.
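To make the idea of an evolved validation function concrete, the following is a minimal sketch in Python, not the methodology of the paper. It represents a candidate validation function as an expression tree over a terminal set and a function set, and rates it with a fitness function that checks whether the ground-truth partition scores higher than perturbed copies of it. The terminals (`intra`, `inter`, `ncomm`), the four arithmetic functions, the perturbation scheme and the toy graph are all assumptions chosen for illustration; the abstract does not state the actual choices.

```python
import random

# Hypothetical terminal set: simple statistics of a partition.
# These are illustrative assumptions, not the terminals used in the paper.
def terminals(edges, partition):
    m = len(edges)
    intra = sum(1 for u, v in edges if partition[u] == partition[v])
    return {
        "intra": intra / m,        # fraction of edges inside communities
        "inter": (m - intra) / m,  # fraction of edges between communities
        "ncomm": float(len(set(partition.values()))),
    }

# Hypothetical function set; division is "protected", as is common in GP.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}

def evaluate(tree, env):
    """Evaluate an expression tree: a terminal name, a constant, or (op, left, right)."""
    if isinstance(tree, str):
        return env[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))

def score(tree, edges, partition):
    """Apply a candidate validation function to one partition of the network."""
    return evaluate(tree, terminals(edges, partition))

def perturb(partition, k, rng):
    """Return a copy of the partition with k random nodes moved to another community."""
    noisy = dict(partition)
    labels = sorted(set(partition.values()))
    for node in rng.sample(sorted(noisy), k):
        noisy[node] = rng.choice([c for c in labels if c != noisy[node]])
    return noisy

def fitness(tree, edges, truth, trials=50, seed=0):
    """Assumed fitness: fraction of perturbed partitions scored below the ground truth."""
    rng = random.Random(seed)
    best = score(tree, edges, truth)
    wins = sum(best > score(tree, edges, perturb(truth, 2, rng)) for _ in range(trials))
    return wins / trials

# Toy network: two triangles joined by a single edge, with known communities.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
truth = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# One individual of the population: the program "intra - inter".
candidate = ("sub", "intra", "inter")
print(score(candidate, edges, truth), fitness(candidate, edges, truth))
```

In a full genetic programming run, a population of such trees would be generated at random and evolved through selection, crossover and mutation, with the control parameters (population size, tree depth, number of generations) fixed in advance. Those steps are omitted here to keep the sketch short.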
