A declarative concurrent system for protein structure prediction on GPU

This paper provides a novel perspective on the protein structure prediction (PSP) problem, which focuses on determining putative 3D structures of a protein starting from its primary sequence. The proposed approach takes a multi-agent system (MAS) perspective, in which concurrent agents explore the folding of different parts of a protein. The strength of the approach lies in the agents' ability to apply different types of knowledge, expressed as declarative constraints, to prune the search space of folding alternatives. The paper also makes an important contribution by demonstrating that general-purpose computing on graphics processing units (GPGPU) is suitable for implementing such a MAS infrastructure, with significant performance improvements over the sequential implementation and other methods.
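As a rough illustration of the idea of concurrent agents pruning folding alternatives with declarative constraints, the following minimal sketch may help. It is not the paper's system: it assumes a toy 2D square lattice instead of real 3D conformations, runs on CPU threads rather than a GPU, and all names (`fold_fragment`, `fold_concurrently`, `self_avoiding`, `compact`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Unit moves on a 2D square lattice (a toy stand-in for 3D conformations).
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding(path, nxt):
    """Constraint: no two residues may occupy the same lattice point."""
    return nxt not in path


def compact(path, nxt, max_span=2):
    """Constraint: the fragment must fit in a small bounding box (assumed limit)."""
    xs = [p[0] for p in path] + [nxt[0]]
    ys = [p[1] for p in path] + [nxt[1]]
    return max(xs) - min(xs) <= max_span and max(ys) - min(ys) <= max_span


def fold_fragment(length, constraints):
    """One 'agent': depth-first enumeration of placements for one fragment.

    A partial placement is extended only if every constraint accepts the
    next point, so violating branches are pruned before being explored.
    """
    solutions = []

    def extend(path):
        if len(path) == length:
            solutions.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if all(check(path, nxt) for check in constraints):
                path.append(nxt)
                extend(path)
                path.pop()

    extend([(0, 0)])
    return solutions


def fold_concurrently(fragment_lengths, constraints):
    """Run one agent per fragment concurrently and collect their results."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda n: fold_fragment(n, constraints),
                             fragment_lengths))


if __name__ == "__main__":
    results = fold_concurrently([3, 4], [self_avoiding, compact])
    for n, sols in zip([3, 4], results):
        print(f"fragment of {n} residues: {len(sols)} candidate placements")
```

Each constraint is an ordinary predicate, so adding another source of knowledge amounts to appending one more function to the list; the search procedure itself stays unchanged.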
