Fuzzy Divisive Hierarchical Clustering of Solvents According to Their Experimentally and Theoretically Predicted Descriptors

The present study describes a simple procedure for separating a large group of solvents, 259 in total, into patterns of similarity on the basis of 15 specific descriptors (experimentally determined and theoretically predicted physicochemical parameters). Solvent data are usually characterized by high variability and by differences in molecular symmetry and spatial orientation. Chemometric methods can be used to extract and explore accurately the information contained in such data. To this end, advanced fuzzy divisive hierarchical clustering methods were applied to the solvents and their descriptors. The fuzzy divisive hierarchical associative-clustering algorithm provides not only a fuzzy partition of the solvents investigated, but also a fuzzy partition of the descriptors considered. In this way, it is possible to identify the descriptors most specific to each fuzzy partition (group) of solvents, in terms of highest, lowest, or intermediate values. Additionally, the resulting partitioning can be interpreted with respect to molecular symmetry. The chemometric approach used for this goal is the fuzzy c-means method, a semi-supervised clustering procedure. The advantage of such a clustering process is that it separates the solvents into similarity patterns while assigning each solvent a degree of membership in each pattern, so that possible membership of the same object (solvent) in another cluster can also be considered. Partitioning based on a hybrid set of theoretical molecular descriptors and experimentally obtained ones permits a more straightforward separation into groups of similarity and an acceptable interpretation. An important link is established between the similarity groups of objects and the similarity groups of variables. 
Ten classes of solvents are interpreted in terms of their specific descriptors; one of the classes contains a single object and could be interpreted as an outlier. In a broader perspective, the results show that the fuzzy clustering approach provides a useful tool for partitioning by the variables related to the main physicochemical properties of the solvents. It becomes possible to offer a simple guide for solvent recognition based on theoretically calculated or experimentally determined descriptors related to the physicochemical properties of the solvents.
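
The degree-of-membership idea behind fuzzy c-means can be illustrated with a minimal sketch. This is not the authors' fuzzy divisive hierarchical implementation; it is the plain fuzzy c-means iteration, which a divisive scheme would apply repeatedly as a two-cluster split. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initialization, and the two-dimensional synthetic data are all assumptions made for illustration; no solvent descriptors from the study are used.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Plain fuzzy c-means (illustrative sketch, not the paper's code).

    X : (n, p) array of n objects described by p descriptors
    c : number of fuzzy clusters
    m : fuzzifier (> 1); m = 2 is a common default, assumed here
    Returns the (n, c) membership matrix U and the (c, p) cluster centers.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        W = U ** m
        # Centers are membership-weighted means of the objects.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance of every object to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: closer centers receive higher membership.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Synthetic stand-in for a descriptor table: two well-separated groups.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=0.0, scale=0.3, size=(20, 2)),
    data_rng.normal(loc=5.0, scale=0.3, size=(20, 2)),
])

U, centers = fuzzy_c_means(X, c=2)
hard_labels = U.argmax(axis=1)  # defuzzified assignment, if one is needed
```

Each row of `U` gives one object's degrees of membership across the clusters and sums to 1, so an object lying between two groups shows up with comparable memberships in both rather than being forced into a single class. In practice the descriptors would normally be scaled before clustering, since the Euclidean distance used here is sensitive to the units of each variable.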
