Augmenting Basin-Hopping With Techniques From Unsupervised Machine Learning: Applications in Spectroscopy and Ion Mobility

Evolutionary algorithms such as the basin-hopping (BH) algorithm have proven useful for difficult nonlinear optimization problems with multiple modalities and variables. Applications of these algorithms range from the characterization of molecular states in statistical physics and molecular biology to geometric packing problems. A key feature of BH is that it generates a coarse-grained mapping of a potential energy surface (PES) in terms of its local minima. This mapping can then be used to gain insight into molecular dynamics and thermodynamic properties. Here we describe how concepts from unsupervised machine learning can augment BH searches of the PES so that local minima, and the transition states connecting them, are identified more efficiently. Specifically, we introduce similarity indices, hierarchical clustering, and multidimensional scaling to the BH methodology. The same machine learning techniques can also serve as tools for interpreting and rationalizing experimental results from spectroscopic and ion mobility investigations (e.g., spectral assignment, dynamic collision cross sections). We illustrate this with two case studies: (1) assigning the infrared multiple photon dissociation spectrum of the protonated serine dimer and (2) determining the temperature-dependent collision cross section of protonated alanine tripeptide.
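As a rough illustration of the idea of pairing BH with hierarchical clustering, the following minimal Python sketch runs a basin-hopping search on a toy two-dimensional surface and then groups the visited minima by agglomerative clustering so that each distinct minimum is reported once. Everything in it is an assumption made for illustration: the toy `energy` function, the gradient-descent `local_minimize`, the Euclidean distance used in place of a molecular similarity index, the single-linkage rule, and all parameter values. It is not the procedure used in the work summarized above.

```python
import math
import random


def energy(p):
    """Toy 2D surface with several local minima (illustrative, not a molecular PES)."""
    x, y = p
    return 0.1 * (x * x + y * y) - math.cos(2.0 * x) - math.cos(2.0 * y)


def gradient(p):
    x, y = p
    return (0.2 * x + 2.0 * math.sin(2.0 * x), 0.2 * y + 2.0 * math.sin(2.0 * y))


def local_minimize(p, step=0.05, iters=500):
    """Plain gradient descent; stands in for a real geometry optimizer."""
    x, y = p
    for _ in range(iters):
        gx, gy = gradient((x, y))
        x, y = x - step * gx, y - step * gy
    return (x, y)


def basin_hopping(start, n_steps=60, hop=2.5, temperature=1.0, seed=0):
    """Perturb, relax to the nearest minimum, accept or reject (Metropolis)."""
    rng = random.Random(seed)
    current = local_minimize(start)
    e_current = energy(current)
    visited = [current]
    for _ in range(n_steps):
        trial = (current[0] + rng.uniform(-hop, hop),
                 current[1] + rng.uniform(-hop, hop))
        trial = local_minimize(trial)
        e_trial = energy(trial)
        visited.append(trial)  # keep every relaxed structure for later clustering
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / temperature):
            current, e_current = trial, e_trial
    return visited


def distance(a, b):
    """Dissimilarity between two structures (here simply Euclidean distance)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def agglomerate(points, cutoff):
    """Single-linkage agglomerative clustering, stopped at a distance cutoff."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > cutoff:
            break  # the closest remaining pair is too dissimilar to merge
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters


def unique_minima(points, cutoff=0.1):
    """Return the lowest-energy member of each cluster, sorted by energy."""
    reps = []
    for members in agglomerate(points, cutoff):
        lowest = min(members, key=lambda i: energy(points[i]))
        reps.append(points[lowest])
    return sorted(reps, key=energy)


visited = basin_hopping((3.0, 3.0))
minima = unique_minima(visited)
for m in minima:
    print(f"x = {m[0]: .3f}, y = {m[1]: .3f}, E = {energy(m): .4f}")
```

For molecular systems the Euclidean distance would be replaced by a structural similarity index that is invariant to rotation and translation, and the cutoff would need to be tuned to the chosen index.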
