FROST: revisited and distributed

FROST (Fold Recognition-Oriented Search Tool) [A. Marin et al., 2002] is a software tool whose purpose is to assign a 3D structure to a protein sequence. It is based on a series of filters and uses a database of about 1200 known 3D structures, each one associated with empirically determined score distributions. FROST uses these distributions to normalize the score obtained when a protein sequence is aligned with a particular 3D structure. Computing these distributions is extremely time-consuming: it requires solving about 1,200,000 hard combinatorial optimization problems and takes about 40 days on a 2.4 GHz computer. This paper describes how FROST has been successfully redesigned and structured into modules and independent tasks. The new package organization allows these tasks to be distributed and executed in parallel using a centralized, dynamic load balancing strategy. On a cluster of 12 PCs, computing the score distributions now takes about 3 days, which represents a parallelization efficiency of about 1.
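As a rough illustration of what a centralized, dynamic load balancing strategy can look like, the following Python sketch keeps all pending tasks in one shared queue and lets each idle worker pull the next task as soon as it finishes the previous one. It is not FROST's actual implementation: the names `run_master_worker`, `solve`, and `parallel_efficiency` are hypothetical, threads stand in for cluster nodes, and `solve` is a placeholder for one alignment problem. The second function only reproduces the efficiency arithmetic from the figures quoted above (serial time divided by the number of processors times the parallel time).

```python
import queue
import threading


def run_master_worker(tasks, n_workers, solve):
    """Centralized dynamic load balancing (illustrative sketch).

    All tasks sit in one shared queue; each worker repeatedly pulls the
    next pending task, so faster workers naturally process more tasks.
    `solve` is a hypothetical stand-in for one alignment problem.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()  # ask the central queue for work
            except queue.Empty:
                return  # no work left: this worker stops
            value = solve(task)
            with lock:
                results[task] = value

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Efficiency = speedup / number of processors."""
    return t_serial / (n_procs * t_parallel)


if __name__ == "__main__":
    # Toy tasks: squaring integers instead of solving alignment problems.
    out = run_master_worker(range(20), n_workers=4, solve=lambda x: x * x)
    print(len(out))
    # Figures from the abstract: 40 days serial, 3 days on 12 PCs.
    print(parallel_efficiency(40, 3, 12))
```

With the rounded figures from the abstract (40 days, 3 days, 12 PCs), the efficiency formula gives 40 / 36, or roughly 1.1, consistent with the stated efficiency of about 1. A shared pull-based queue is a natural fit when task durations vary widely, since no static assignment of tasks to machines has to be guessed in advance.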

[1]  R. Lathrop. The protein threading problem with sequence amino acid interaction preferences is NP-complete, 1994, Protein Engineering.

[2]  Harvey J. Greenberg, et al. Opportunities for Combinatorial Optimization in Computational Biology, 2004, INFORMS J. Comput.

[3]  David T. Jones, et al. Protein superfamilies and domain superfolds, 1994, Nature.

[4]  Teresa Head-Gordon, et al. Computational challenges in structural and functional genomics, 2001, IBM Syst. J.

[5]  Ying Xu, et al. Raptor: Optimal Protein Threading by Linear Programming, 2003, J. Bioinform. Comput. Biol.

[6]  Satoru Miyano, et al. On the approximation of protein threading, 1997, RECOMB '97.

[7]  Stefan Balev. Solving the Protein Threading Problem by Lagrangian Relaxation, 2004, WABI.

[8]  Jean-François Gibrat, et al. FROST: A filter-based fold recognition method, 2002, Proteins.

[9]  Y. Xu, et al. Protein threading using PROSPECT: Design and evaluation, 2000, Proteins.

[10]  Jinbo Xu. Speedup LP Approach to Protein Threading via Graph Reduction, 2003, WABI.

[11]  C. Chothia. Proteins. One thousand families for the molecular biologist, 1992, Nature.

[12]  Thomas Lengauer, et al. Computational Biology at the Beginning of the Post-genomic Era, 2001, Informatics.

[13]  Rumen Andonov, et al. Protein Threading: From Mathematical Models to Parallel Implementations, 2004, INFORMS J. Comput.

[14]  Rumen Andonov, et al. Parallel divide and conquer approach for the protein threading problem, 2004, Concurr. Pract. Exp.

[15]  Temple F. Smith, et al. Global optimum protein threading with gapped alignment and empirical pair score functions, 1996, Journal of Molecular Biology.

[16]  João Meidanis, et al. Introduction to computational molecular biology, 1997.

[17]  C. Chothia. One thousand families for the molecular biologist, 1992, Nature.

[18]  Jinbo Xu. Protein Structure Prediction by Linear Programming, 2003.

[19]  Ying Xu, et al. An Efficient Computational Method for Globally Optimal Threading, 1998, J. Comput. Biol.