Paper information - NORMALIZED MUTUAL INFORMATION EXAGGERATES COMMUNITY DETECTION PERFORMANCE

We present a critical evaluation of normalized mutual information (NMI) as an evaluation metric for community detection (CD). NMI exaggerates the leximin method’s performance on weak communities: Does leximin, in finding the trivial singletons clustering, truly outperform eight other CD methods? Three NMI improvements from the literature are AMI, rrNMI, and cNMI. We show equivalences under relevant randomness models, and for CD evaluation, we advise one-sided AMI under M_all (all partitions of n nodes). This work seeks 1) to start a conversation on robust measurements, and 2) to advocate evaluations which do not give “free lunch”.
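The "free lunch" concern can be illustrated with a small sketch. It assumes the arithmetic-mean normalization NMI = 2 I(U;V) / (H(U) + H(V)), which the abstract does not specify, and uses a hypothetical toy ground truth of two equal communities on eight nodes. The function names are illustrative only and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two clusterings of the same nodes."""
    n = len(u)
    count_u, count_v = Counter(u), Counter(v)
    joint = Counter(zip(u, v))
    return sum(
        (c / n) * math.log(c * n / (count_u[a] * count_v[b]))
        for (a, b), c in joint.items()
    )


def nmi(u, v):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    hu, hv = entropy(u), entropy(v)
    if hu + hv == 0:
        return 1.0  # both clusterings are a single block, hence identical
    return 2 * mutual_information(u, v) / (hu + hv)


# Hypothetical toy ground truth: two communities of four nodes each.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
singletons = list(range(8))  # trivial clustering: every node is its own cluster
one_block = [0] * 8          # trivial clustering: all nodes in one cluster

print(round(nmi(truth, singletons), 3))  # about 0.5, despite carrying no structure
print(round(nmi(truth, one_block), 3))   # about 0.0
```

Under these assumptions, the singletons clustering earns an NMI of roughly 0.5 even though it reports no community structure at all, whereas the single-block clustering scores roughly 0. Chance-corrected measures such as AMI subtract the mutual information expected under a chosen randomness model, which is the kind of correction the abstract argues for.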

Arya D. McCarthy | David W. Matula
