The structural bioinformatics library: modeling in biomolecular science and beyond

Motivation: Software in structural bioinformatics has mainly been application driven. To serve both practitioners seeking off-the-shelf applications and developers seeking advanced building blocks for novel applications, we designed the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. It rests on a modular design that offers a rich and versatile framework for developing novel applications requiring well-specified complex operations, without compromising robustness or performance.

Results: The SBL involves four software components (numbered 1–4 below). For end-users, the SBL provides ready-to-use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, and to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with a modular design, involving core (2) algorithms, (3) biophysical models and (4) modules, the latter being especially suited to developing novel applications. The SBL comes with thorough documentation consisting of user and reference manuals, and a Bugzilla platform to handle community feedback.

Availability and Implementation: The SBL is available from http://sbl.inria.fr

Contact: Frederic.Cazals@inria.fr

Supplementary information: Supplementary data are available at Bioinformatics online.
