UltraScan Solution Modeler: integrated hydrodynamic parameter and small angle scattering computation and fitting tools

UltraScan Solution Modeler (US-SOMO) processes atomic and lower-resolution bead model representations of biological and other macromolecules to compute hydrodynamic parameters, such as sedimentation and diffusion coefficients, relaxation times, and intrinsic viscosity, as well as small angle scattering curves, all of which contribute to our understanding of molecular structure in solution. Knowledge of the structure of biological macromolecules helps researchers understand their function, as a path to disease prevention and to therapeutics for conditions such as cancer, thrombosis, Alzheimer's disease and others. US-SOMO provides a convergence of experimental, computational, and modeling techniques, in which detailed molecular structure and properties are determined from data obtained from a range of experimental techniques that, by themselves, give incomplete information. Our goal in this work is to develop the infrastructure and user interfaces that will enable a wide range of scientists to carry out complicated experimental data analysis techniques on XSEDE. Our user community predominantly consists of biophysics and structural biology researchers. A recent PubMed search reports 9,205 papers in the past decade referencing the techniques we support. We believe our software will provide these researchers with a convenient and unique framework to refine structures, thus advancing their research. The computed hydrodynamic parameters and scattering curves are screened against experimental data, effectively pruning potential structures into equivalence classes. Experimental methods may include analytical ultracentrifugation, dynamic light scattering, small angle X-ray and neutron scattering, NMR, fluorescence spectroscopy, and others. One source of macromolecular models is X-ray crystallography. However, the conformation in solution may not match that observed in the crystal form.
Using computational techniques, an initial fixed model can be expanded into a search space by means of high-temperature molecular dynamics approaches or stochastic methods such as Brownian dynamics. The number of structures produced can vary greatly, ranging from hundreds to tens of thousands or more. This introduces a number of cyberinfrastructure challenges. Computing hydrodynamic parameters and small angle scattering curves can be computationally intensive for each structure, and therefore cluster compute resources are essential for timely results. Input and output data sizes can vary greatly, from less than 1 MB to 2 GB or more. Although the parallelization is trivial, compute sizes vary as widely as data sizes, ranging from one to potentially thousands of cores, with compute times of minutes to hours. In addition to the distributed computing infrastructure challenges, an important concern was how to allow a user to conveniently submit jobs, monitor them, and retrieve results from within the C++/Qt GUI application while maintaining a method for authentication, approval and registered publication usage throttling. Middleware supporting these design goals has been integrated into the application with assistance from the Open Gateway Computing Environments (OGCE) collaboration team. The approach was tested on various XSEDE clusters and local compute resources. This paper reviews current US-SOMO functionality and implementation, with a focus on the newly deployed cluster integration.
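
The screening step described above, in which computed curves for many candidate structures are compared against experimental data, can be illustrated with a minimal sketch. This is not US-SOMO code: the function names, the dictionary input layout, the chi-squared-per-point score, and the threshold of 2.0 are all assumptions chosen for illustration. Because each structure is scored independently, the work maps directly onto a pool of workers; a thread pool stands in here for cluster job submission.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def chi_squared_per_point(computed, observed, sigma):
    """Mean squared error-normalized residual between two curves.

    All three sequences are assumed to be sampled on the same grid.
    """
    if not (len(computed) == len(observed) == len(sigma)):
        raise ValueError("curves must be sampled on the same grid")
    total = sum(((c - o) / s) ** 2 for c, o, s in zip(computed, observed, sigma))
    return total / len(observed)


def screen_models(models, observed, sigma, threshold=2.0, workers=4):
    """Return (name, score) pairs with score <= threshold, best fit first.

    models: dict mapping a model name to its computed curve
            (a hypothetical layout, not a US-SOMO file format).
    """
    names = list(models)
    score = partial(chi_squared_per_point, observed=observed, sigma=sigma)
    # Each model is scored independently, so the loop is trivially parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(score, (models[n] for n in names)))
    kept = [(n, s) for n, s in zip(names, scores) if s <= threshold]
    return sorted(kept, key=lambda item: item[1])


# Toy data: one experimental curve and two candidate structures.
observed = [100.0, 80.0, 50.0, 20.0]
sigma = [2.0, 2.0, 1.0, 1.0]
models = {
    "compact": [101.0, 79.0, 50.5, 20.5],
    "extended": [110.0, 70.0, 45.0, 25.0],
}
accepted = screen_models(models, observed, sigma)
print(accepted)
```

With this toy data only the "compact" model falls under the threshold. In a real workflow the computed curves would come from the hydrodynamic or scattering calculation for each structure, and the acceptance criterion would depend on the experimental technique and its error model.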
