NORMALIZED MUTUAL INFORMATION EXAGGERATES COMMUNITY DETECTION PERFORMANCE

We present a critical assessment of normalized mutual information (NMI) as an evaluation metric for community detection (CD). NMI exaggerates the leximin method’s performance on weak communities: Does leximin, in finding the trivial singletons clustering, truly outperform eight other CD methods? Three improvements to NMI from the literature are AMI, rrNMI, and cNMI. We show equivalences among them under relevant randomness models, and for CD evaluation we advise one-sided AMI under M_all (the set of all partitions of n nodes). This work seeks 1) to start a conversation on robust measurements, and 2) to advocate evaluations that do not give a “free lunch”.
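
As a minimal illustration of the "free lunch" concern, the sketch below computes NMI between a hypothetical ground truth (100 nodes in four equal communities, chosen only for illustration) and two trivial clusterings. It assumes the arithmetic-mean normalization, NMI = 2 I(U;V) / (H(U) + H(V)); other normalizations exist and the choice here is an assumption, not necessarily the variant evaluated in this work. The function names are likewise illustrative.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same nodes."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0:
        return 1.0  # both clusterings are the single-cluster partition
    return 2 * mutual_information(a, b) / (h_a + h_b)


# Hypothetical ground truth: 100 nodes, four communities of 25 nodes each.
truth = [i // 25 for i in range(100)]
singletons = list(range(100))   # every node in its own cluster
one_cluster = [0] * 100         # all nodes in a single cluster

nmi_singletons = nmi(truth, singletons)
nmi_one_cluster = nmi(truth, one_cluster)

print(round(nmi_singletons, 3))   # 0.463
print(round(nmi_one_cluster, 3))  # 0.0
```

Under these assumptions, the singletons clustering scores about 0.46 despite carrying no information about the communities, because it determines the ground truth exactly and so I(U;V) = H(U). The equally trivial single-cluster partition scores 0.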