Computational Protein Design: trying an Answer Set Programming approach to solve the problem

Proteins are macromolecules made of a chain of amino acids. The combinatorial nature of the space of possible protein conformations makes computer-aided protein study a major research field in bioinformatics. The computational protein design (CPD) problem aims at finding the best protein conformation to perform a given task. It can be reduced to an optimization problem: finding the minimum of an energy function that depends on the amino-acid interactions in the protein. We have designed a model based on Answer Set Programming (ASP). Although the CPD problem is easily modeled as an ASP program, no practical implementation able to handle real-sized instances has been published so far. We have identified the main source of difficulty for current ASP solvers and run a series of benchmarks that highlight the importance of a good upper-bound estimate of the target minimum energy for reducing the amount of combinatorial search. Our solution clearly outperforms a direct ASP implementation without this estimate and performs comparably to SAT-based approaches. It remains less efficient than the recent cost function network approach of D. Allouche et al., which shows that there is still room for improving the optimization component of ASP with more dynamic strategies.
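To make the role of the upper bound concrete, here is a minimal Python sketch, not the ASP model described above. It assumes a hypothetical toy instance in which each position picks one candidate (a "rotamer") and the energy is a sum of non-negative unary and pairwise terms; the tables `unary` and `pair`, and the greedy starting bound, are illustrative assumptions rather than data or methods from the paper. A depth-first search discards any partial assignment whose cost already exceeds the best known energy, so a tighter initial bound can only reduce the number of nodes visited.

```python
import math

# Hypothetical toy instance (not from the paper): 3 positions, 2 rotamers each.
unary = [[1.0, 2.0], [0.5, 0.0], [2.0, 1.0]]
# pair[(i, j)][a][b]: interaction energy of rotamer a at i with rotamer b at j (i < j).
pair = {
    (0, 1): [[0.0, 3.0], [1.0, 0.0]],
    (0, 2): [[2.0, 0.0], [0.0, 1.0]],
    (1, 2): [[0.0, 1.0], [2.0, 0.0]],
}


def energy(assign):
    """Total energy of a complete assignment (one rotamer index per position)."""
    total = sum(unary[i][a] for i, a in enumerate(assign))
    total += sum(tbl[assign[i]][assign[j]] for (i, j), tbl in pair.items())
    return total


def greedy_upper_bound():
    """Cheap feasible solution: lowest unary energy at each position."""
    assign = [min(range(len(u)), key=u.__getitem__) for u in unary]
    return energy(assign)


def branch_and_bound(upper=math.inf):
    """Depth-first search pruned by the best known energy.

    Pruning on the partial cost is only valid because all energies in this
    toy instance are non-negative, so the partial cost is a lower bound.
    """
    best = {"energy": upper, "assign": None, "nodes": 0}

    def extend(partial, cost):
        best["nodes"] += 1
        if cost > best["energy"]:
            return  # cannot improve on the best known energy
        pos = len(partial)
        if pos == len(unary):
            best["energy"], best["assign"] = cost, list(partial)
            return
        for a in range(len(unary[pos])):
            step = unary[pos][a]
            step += sum(pair[(j, pos)][partial[j]][a] for j in range(pos))
            extend(partial + [a], cost + step)

    extend([], 0.0)
    return best


unbounded = branch_and_bound()
bounded = branch_and_bound(upper=greedy_upper_bound())
print(bounded["assign"], bounded["energy"], bounded["nodes"], unbounded["nodes"])
```

On an instance this small the bound makes little or no difference; the sketch only shows where such an estimate enters the search. An ASP solver's optimization procedure is more involved than this plain branch-and-bound.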
