SAMPL3: blinded prediction of host–guest binding affinities, hydration free energies, and trypsin inhibitors

This special issue of the Journal of Computer-Aided Molecular Design is the culmination of the 4th Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) challenge and workshop. SAMPL3 had three datasets: blinded small-molecule hydration energies, provided by Peter Guthrie [1]; two novel host–guest systems, including eleven unpublished binding energies, provided by Adam Urbach and Lyle Isaacs [2]; and a monumental dataset including structural and affinity data for 500 fragments against trypsin, provided by Tom Peat [3]. The SAMPL3 workshop drew more than 40 attendees, and the SAMPL3 challenge received 103 submissions from 23 participating groups using a variety of methods, including discrete and dynamic conformational sampling; implicit, semi-implicit and explicit water models; and a myriad of force fields and charge models. Gilson [2] and Geballe [1] have provided summaries of the host–guest challenge and the solvation-energy challenge, respectively. As with prior SAMPL challenges, many different approaches generated high-quality predictions, yet no single technique distinguished itself significantly. Nevertheless, SAMPL yielded many important insights into the strengths and limitations of computational and experimental methods.

SAMPL3 was the first blinded challenge to include prediction of host–guest binding affinities. Host–guest binding affinities made an outstanding blind challenge because the systems are simple enough to encourage participants to recognize, explore and address assumptions and errors. Most participants had great difficulty modeling the aspartyl-protease-like formal charges found in the host molecules. This is concerning: although ionization sites occur commonly in protein–ligand systems, they are rarely addressed at the level of detail participants found necessary for this host–guest system.
More host–guest examples should be included in future SAMPL challenges, as their streamlined nature highlights assumptions that can be too easily overlooked.

SAMPL is one of several projects that provide blinded or prospective experimental challenges to the computational community [4–6]. These projects are intended to serve as both a guidepost for computational progress and a meeting ground for experimental and computational scientists. However, it has been a struggle to generate mutual interest between the two groups. This year, SAMPL had a breakthrough in the form of Lyle Isaacs (host–guest affinities) and Tom Peat (trypsin structures and affinities). Lyle and Tom are experimentalists who provided data to SAMPL3, attended the SAMPL workshop, and offered insights and challenges to the computational scientists. We hope they are the first of many experimentalists to join SAMPL and challenge theorists with their data.

Unfortunately, no experimental collaborator has emerged to provide prospective hydration free energies, which have been part of each of the four SAMPL evaluations. Hydration energies are the most basic measure of the solvation of molecules in water. Aqueous solvation plays a critical role in most biophysical and biochemical phenomena, and our ability to accurately predict biophysical processes is limited by our ability to accurately calculate solvation interactions. Hydration energies represent one of the simplest experiments that allow us to evaluate these predictions. As a consequence, comparison to hydration energies is a fundamental tool for evaluating force fields and electrostatic models (for example, see [7]).

A. G. Skillman, OpenEye Scientific Software, Santa Fe, NM, USA; e-mail: skillman@eyesopen.com

[1] R. Brand, et al. Testing for the presence of positive-outcome bias in peer review: a randomized controlled trial, 2010, Archives of Internal Medicine.

[2] A. Sanabria, et al. Randomized controlled trial, 2005, World Journal of Surgery.

[3] Yue Shi, et al. Multipole electrostatics in hydration free energy calculations, 2011, J. Comput. Chem.

[4] J. Nielsen, et al. The pKa Cooperative: A collaborative effort to advance structure‐based calculations of pKa values and electrostatic effects in proteins, 2011, Proteins.

[5] Claire S. Adjiman, et al. Towards crystal structure prediction of complex organic compounds – a report on the fifth blind test, 2011, Acta Crystallographica Section B: Structural Science.

[6] Richard A. Brand, et al. Outcome-Blinded Peer Review—Reply, 2011.

[7] P. Bach, et al. Outcome-blinded peer review, 2011, Archives of Internal Medicine.

[8] Thomas S. Peat, et al. The DINGO dataset: a comprehensive set of data for the SAMPL challenge, 2011, Journal of Computer-Aided Molecular Design.

[9] Michael K. Gilson, et al. Blind prediction of host–guest binding affinities: a new SAMPL3 challenge, 2012, Journal of Computer-Aided Molecular Design.

[10] Traian Sulea, et al. High incidence of ubiquitin‐like domains in human ubiquitin‐specific proteases, 2007, Proteins.

[11] R. Rosenthal, et al. Selective publication of antidepressant trials and its influence on apparent efficacy, 2008, The New England Journal of Medicine.

[12] Matthew T. Geballe, et al. The SAMPL3 blind prediction challenge: transfer energy overview, 2012, Journal of Computer-Aided Molecular Design.

[13] Eaton E. Lattman, et al. Seventh Meeting on the Critical Assessment of Techniques for Protein Structure Prediction, 2007, Proteins.

[14] David L. Mobley, et al. Predicting small-molecule solvation free energies: an informal blind test for computational chemistry, 2008, Journal of Medicinal Chemistry.