Characterizing and Comparing External Measures for the Assessment of Cluster Analysis and Community Detection

In the context of cluster analysis and graph partitioning, many external evaluation measures have been proposed in the literature to compare two partitions of the same set. This makes selecting the most appropriate measure for a given situation a challenge for the end user. However, this issue is overlooked in the literature. Researchers tend to follow tradition and use the standard measures of their field, although these often became standard only because previous researchers started using them consistently. In this work, we propose a new empirical evaluation framework to address this issue and help end users select an appropriate measure for their application. For a collection of candidate measures, the framework first describes their behavior by computing them on a generated dataset of partitions, obtained by applying a set of predefined parametric partition transformations. Second, it performs a regression analysis to characterize the measures in terms of how they are affected by these parameters and transformations. This allows both describing and comparing the measures. Our approach is not tied to any specific measure or application, so it can be applied to any situation. We illustrate its relevance by applying it to a selection of standard measures, and show how it can be put into practice through two concrete use cases.
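
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the single transformation (`perturb`, which moves a fraction of elements to a different cluster), the choice of the Rand index as the candidate measure, and the one-variable least-squares fit are all assumptions made for illustration, and the names and parameter values are hypothetical.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Rand index: fraction of element pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def perturb(labels, fraction, k, rng):
    """Hypothetical parametric transformation: move a given fraction of
    elements to a different cluster chosen uniformly at random."""
    out = list(labels)
    n_moved = round(fraction * len(out))
    for i in rng.sample(range(len(out)), n_moved):
        out[i] = rng.choice([c for c in range(k) if c != out[i]])
    return out


def ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x (single predictor)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)          # fixed seed so the run is repeatable
n, k = 60, 3                    # assumed sizes: 60 elements, 3 clusters
reference = [i % k for i in range(n)]

# Step 1: compute the measure on partitions generated by the transformation.
xs, ys = [], []
for fraction in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]:
    for _ in range(5):
        xs.append(fraction)
        ys.append(rand_index(reference, perturb(reference, fraction, k, rng)))

# Step 2: regress the measure on the transformation parameter.
intercept, slope = ols(xs, ys)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

In this sketch, the fitted slope summarizes how strongly the measure reacts to the transformation parameter. Repeating the fit for several measures and several transformations would give coefficients that can be compared across measures, which is the kind of characterization the abstract describes.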
