QSPR Approach to Predict Nonadditive Properties of Mixtures. Application to Bubble Point Temperatures of Binary Mixtures of Liquids

This paper develops a methodology for QSPR modeling of mixtures and applies it to vapor/liquid equilibrium diagrams, specifically to the bubble point temperatures of binary liquid mixtures. Two types of special mixture descriptors, based on the SiRMS and ISIDA approaches, were developed. SiRMS‐based fragment descriptors involve atoms belonging to both components of the mixture, whereas each ISIDA fragment belongs to only one of the components. The models were built on a data set containing the phase diagrams for 167 mixtures formed from different combinations of 67 pure liquids. Consensus models were developed using nonlinear Support Vector Machine (SVM), Associative Neural Networks (ASNN), and Random Forest (RF) approaches. The SVM and ASNN models used the ISIDA fragment descriptors, whereas the RF models used the SiRMS (Simplex) descriptors. The models were validated using three different protocols, “Points out”, “Mixtures out” and “Compounds out”, which differ in the rules used to form the training and test sets in each fold of cross‐validation. A final validation was performed on an additional set of 94 mixtures formed by combining 34 novel compounds and the modeling set chemicals with each other. The root mean squared error of predictions for new mixtures of already known liquids does not exceed 5.7 K, which outperforms COSMO‐RS models. The developed QSAR methodology can be applied to the modeling of any nonadditive property of binary mixtures (antiviral activities, drug formulation, etc.).
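The three validation protocols differ only in what is held out together in each cross‐validation fold. The abstract does not spell out the exact rules, so the following Python sketch rests on an assumed reading: "Points out" holds out individual phase‐diagram points, "Mixtures out" holds out every point of a mixture at once, and "Compounds out" holds out every mixture containing a held‐out compound. All names (`Point`, `points_out`, `mixtures_out`, `compounds_out`) and the toy compound labels are hypothetical and not taken from the paper.

```python
import random
from collections import namedtuple

# One point on a phase diagram: the two components and the mole fraction
# of the first one. The property value is not needed to form the splits.
Point = namedtuple("Point", ["comp_a", "comp_b", "x_a"])


def mixture_of(p):
    """Order-independent key identifying the binary mixture of a point."""
    return tuple(sorted((p.comp_a, p.comp_b)))


def _folds(items, n_folds, seed):
    """Shuffle the unique items and deal them into n_folds disjoint sets."""
    items = sorted(set(items))
    random.Random(seed).shuffle(items)
    return [set(items[k::n_folds]) for k in range(n_folds)]


def points_out(points, n_folds=5, seed=0):
    """Assumed rule: individual points are held out at random."""
    for held in _folds(range(len(points)), n_folds, seed):
        train = [i for i in range(len(points)) if i not in held]
        yield train, sorted(held)


def mixtures_out(points, n_folds=5, seed=0):
    """Assumed rule: all points of a held-out mixture go to the test set."""
    for held in _folds((mixture_of(p) for p in points), n_folds, seed):
        test = [i for i, p in enumerate(points) if mixture_of(p) in held]
        train = [i for i, p in enumerate(points) if mixture_of(p) not in held]
        yield train, test


def compounds_out(points, n_folds=5, seed=0):
    """Assumed rule: every mixture containing a held-out compound is tested."""
    compounds = {c for p in points for c in mixture_of(p)}
    for held in _folds(compounds, n_folds, seed):
        test = [i for i, p in enumerate(points) if held & set(mixture_of(p))]
        train = [i for i, p in enumerate(points) if not held & set(mixture_of(p))]
        yield train, test


# Toy input with placeholder compound labels (not data from the paper).
toy_points = [
    Point(a, b, x)
    for a, b in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
    for x in (0.25, 0.5, 0.75)
]

for train, test in compounds_out(toy_points, n_folds=4):
    print(len(train), "training points,", len(test), "test points")
```

Under this reading the protocols become progressively harder: in "Points out" the model has already seen other compositions of the test mixture, in "Mixtures out" it has seen both components but never together, and in "Compounds out" at least one component of each test mixture is new to the model.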
