Algorithm5: A Technique for Fuzzy Similarity Clustering of Chemical Inventories

Clustering chemical inventories by structural similarity has proven useful in a number of applications related to the use and enhancement of those inventories. However, the widely used Jarvis–Patrick clustering algorithm has several weaknesses that make it difficult to cluster large databases in a consistent, satisfactory, and timely manner. Jarvis–Patrick clusters tend to be either too large and heterogeneous (i.e., “chained”) or too small and exclusive (i.e., under-clustered), and the algorithm requires time-consuming manual tuning. This paper describes a computer algorithm that is nondirective, in that it performs the clustering without manual tuning yet still generates useful clustering results.