On the Use of Information Theory for Assessing Molecular Diversity

In a recent article published in Molecules, Lin presented a novel approach for assessing molecular diversity based on Shannon's information theory. In this method, a set of compounds is viewed as a static collection of microstates that can register information about their environment at some predetermined capacity. Diversity is directly related to the information conveyed by the population, as quantified by Shannon's classical entropy equation. Despite its intellectual appeal, the method has a strong tendency to oversample remote areas of the feature space and to produce unbalanced designs. This paper demonstrates this limitation with some simple examples and explains why the method fails to produce results consistent with those of other, traditional methodologies.
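
For readers unfamiliar with the quantity involved, the following is a minimal sketch of Shannon's classical entropy, H = -sum(p_i log p_i), for a discrete probability distribution. It is not Lin's diversity function: how a set of compounds is mapped to probabilities is not described here, so the function simply takes the probabilities as input. The name shannon_entropy, the base-2 logarithm, and the two example distributions are illustrative assumptions.

```python
import math


def shannon_entropy(probabilities, base=2.0):
    """Shannon entropy H = -sum(p_i * log(p_i)) of a discrete distribution.

    Illustrative helper only; the mapping from compounds to probabilities
    is an assumption left to the caller.
    """
    total = sum(probabilities)
    if any(p < 0 for p in probabilities) or not math.isclose(total, 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must be non-negative and sum to 1")
    # Terms with p == 0 are skipped: p * log(p) tends to 0 as p tends to 0.
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


# Hypothetical distributions over four states (not data from the paper).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)
print(h_uniform, h_skewed)
```

For a fixed number of states, the entropy is largest when all states are equally probable, so the uniform distribution above scores higher than the skewed one.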