Modularity of networks

Modularity is a quality function on partitions of a network which may be used to identify highly clustered components. It is commonly used to analyse large real networks, for example to find communities in social networks and related proteins in protein discovery. Given a graph G, the modularity of a partition of the vertex set measures the extent to which edge density is higher within parts than between parts, and the maximum modularity q*(G) of G (where 0 ≤ q*(G) < 1) is the maximum modularity over all partitions of V(G). In essence, modularity allows us to feed in large sets of data and output a vertex partition with an associated score.

Knowledge of the maximum modularity of random graphs is important for determining the statistical significance of the maximum modularity found on a real network. This thesis establishes numerical bounds on the likely maximum modularity of random regular graphs. The maximum modularity of a random cubic network is shown to lie in the interval (0.66, 0.81) with high probability (whp). This result has a practical application: it establishes that a large cubic network with modularity greater than 0.81 has a statistically significant clustering structure.

The evolution of the maximum modularity of Erdős-Rényi random graphs as the edge probability increases is investigated, and three different phases of the likely maximum modularity are found. For np = 1 + o(1) the maximum modularity is 1 + o(1) whp, and for np → ∞ the maximum modularity is o(1) whp. For np = c with c > 1 a constant, functions a(c) and b(c) are constructed with 0 < a(c) < b(c) < 1 and b(c) → 0 as c → ∞ such that whp the maximum modularity is bounded between these functions. Concentration of the maximum modularity about its expectation and structural properties of any optimal partition are also established.

Finding the maximum modularity of graph classes helps us understand the behaviour of the modularity function. We study trees, lattices and related graphs.
The maximum modularity of a large k-ary tree was shown to be near 1 by Bagrow 2012. This was extended to trees with maximum degree o(n^(1/5)) in Montgolfier et al. 2011. This thesis further extends the result to any tree with maximum degree o(n). Indeed, it is shown that the maximum modularity tends to 1 for any graph in which the product of the treewidth and the maximum degree is much less than the number of edges. This shows that random planar graphs typically have modularity 1 + o(1). Lower bounds were given in Guimerà et al. 2004 for the maximum modularity of complete sections of the integer lattice and for lattices with extra axis-aligned edges included. The maximum modularity of any subgraph of this lattice is shown to be at least of the same order of magnitude in the number of edges. This generalises to any graph which can be embedded in Euclidean space such that no edge is too long and no two vertices fall too close together.
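
To make the notion of the modularity of a partition concrete, the following minimal Python sketch computes it for a small graph. It assumes the standard Newman-Girvan definition, which the abstract does not state explicitly: for a graph with m edges, each part A contributes e(A)/m - (vol(A)/(2m))^2, where e(A) is the number of edges inside A and vol(A) is the sum of the degrees of the vertices in A. The function name `modularity` and the example graph are illustrative choices, not taken from the thesis.

```python
from collections import defaultdict


def modularity(edges, part_of):
    """Modularity of a vertex partition (assumed Newman-Girvan definition).

    edges   : list of (u, v) pairs of an undirected graph
    part_of : dict mapping each vertex to the label of its part
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal = defaultdict(int)  # e(A): edges with both endpoints in part A
    volume = defaultdict(int)    # vol(A): sum of degrees of vertices in A

    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            internal[part_of[u]] += 1

    # Edge contribution minus degree tax, summed over the parts.
    return sum(internal[a] / m - (volume[a] / (2 * m)) ** 2 for a in volume)


# Illustrative graph: two triangles joined by a single edge (7 edges).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Partition into the two triangles: 2 * (3/7 - (7/14)^2) = 5/14.
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(edges, two_triangles)

# The trivial one-part partition always has modularity 0.
one_part = {v: "a" for v in range(6)}
q_trivial = modularity(edges, one_part)
```

Under this definition the two-triangle partition scores 5/14, while placing every vertex in one part scores 0. The maximum modularity q*(G) is the largest such score over all partitions, which this sketch does not attempt to find.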