A Bayesian Approach to Predict Solubility Parameters

Solubility is a ubiquitous phenomenon in many aspects of materials science. While solubility can be determined by considering the cohesive forces in a liquid via the Hansen solubility parameters (HSP), quantitative structure–property relationship models are often used for prediction, notably because of their low computational cost. Here, gpHSP, an interpretable and versatile probabilistic approach to determining HSP, is reported. The model is based on Gaussian processes, a Bayesian machine learning approach that provides uncertainty bounds on its predictions. gpHSP achieves its flexibility by leveraging a variety of input data, such as SMILES strings, COSMOtherm simulations, and quantum chemistry calculations. gpHSP is built on experimentally determined HSP, including a general solvent set aggregated from the literature and a polymer set experimentally characterized by the authors. On all sets, a high degree of agreement between predicted and experimental values is obtained, surpassing well-established machine learning methods. The general applicability of gpHSP to the miscibility of organic semiconductors, drug compounds, and general solvents is demonstrated, and the approach can be extended to other domains. gpHSP is a fast and accurate toolbox that could be applied to molecular design for solution processing technologies.
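
To illustrate what "uncertainty bounds" means for a Gaussian process, the following is a minimal sketch of Gaussian process regression in plain Python. It is not the gpHSP implementation: the squared-exponential kernel, the hyperparameter values, the two-component descriptors, and the target values are all assumptions chosen for illustration. The function `gp_predict` returns a predictive mean and a standard deviation, and the standard deviation grows as the query moves away from the training data.

```python
import math

def rbf(x1, x2, length=1.0, variance=1.0):
    """Squared-exponential kernel between two descriptor vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor L of a positive-definite matrix A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        x[i] = (b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i]
    return x

def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X_train, y_train, x_new, noise=1e-6, length=1.0, variance=1.0):
    """Return (mean, std) of the GP posterior at x_new.

    The training targets are centered on their mean, so predictions far
    from the data revert to that mean with the full prior uncertainty.
    """
    n = len(X_train)
    y_mean = sum(y_train) / n
    residuals = [y - y_mean for y in y_train]
    # Kernel matrix with observation noise on the diagonal.
    K = [[rbf(X_train[i], X_train[j], length, variance) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, residuals))
    k_star = [rbf(x, x_new, length, variance) for x in X_train]
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new, length, variance) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Toy data: hypothetical 2-component descriptors and made-up target values.
X_train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y_train = [16.0, 18.0, 17.0]

mean_near, std_near = gp_predict(X_train, y_train, (0.0, 0.0))
mean_far, std_far = gp_predict(X_train, y_train, (10.0, 10.0))
print(f"near training data: {mean_near:.3f} +/- {std_near:.3f}")
print(f"far from training data: {mean_far:.3f} +/- {std_far:.3f}")
```

In a setting like the one the abstract describes, the descriptor vectors would come from sources such as SMILES-derived features or simulation outputs, and the kernel hyperparameters would be fitted to data rather than fixed by hand as they are here.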
