Clustering Validity Indices Evaluation with Regard to Semantic Homogeneity

Clustering validity indices are measures for assessing the quality of data clustering results. Various studies have evaluated their performance thoroughly on both synthetic and real-world datasets. In this work, we describe several approaches to evaluating a clustering scheme. We also present a new solution to the problem of selecting an appropriate clustering validity index. We apply this approach to a real-world task: selecting a suitable clustering validity index for clustering biomedical articles using the MeSH ontology.
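
To make the notion of a validity index concrete, the following minimal sketch computes the silhouette coefficient, a widely used internal index, for two partitions of the same toy data. It is an illustration only: the choice of the silhouette coefficient, the Euclidean distance, the function name `silhouette_score`, and the toy points are assumptions made here and are not taken from this work.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper)."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster
        # (the distance from the point to itself is 0, hence len(own) - 1).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_score(points, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_score(points, [0, 1, 0, 1])   # pairs split apart
print(f"good partition: {good:.3f}, bad partition: {bad:.3f}")
```

The partition that keeps nearby points together scores close to 1, while the one that splits them scores below 0. An index of this kind relies on geometry alone and says nothing about whether the clusters are semantically homogeneous.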