A branch-and-bound algorithm for optimal protein threading with pairwise (contact potential) amino acid interactions

We present a new branch-and-bound method for searching the space of possible "threadings" for the optimal match of a sequence to an adjacency matrix of environments in the "motif threading" version of the "inverse protein folding problem." The method is guaranteed to find the optimal threading first, and thereafter enumerates successive candidate threadings in order of decreasing optimality. It requires only minimal conditions on how environments are defined and on the form of the score function, so the search is general enough to be used with many different score functions that model contact potentials or other interactions between explicit pairs of amino acids. The algorithm has been used with a pairwise interaction score function to identify the optimal threading out of as many as 1.69 × 10^24 possibilities on a Sun Sparcstation IPC workstation in 40 minutes of total elapsed time.
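
To illustrate the claim that the optimal threading is found first and the rest follow in order of decreasing optimality, the following is a minimal sketch of a best-first branch-and-bound search, not the bound or splitting rule used in this work. It assumes a simplified setting: each core segment occupies one sequence position, segment positions must be strictly increasing, and lower scores are better. The names `best_first_threadings`, `total_score`, `toy_g1`, and `toy_g2` are hypothetical, and the toy score functions are arbitrary stand-ins for singleton and pairwise terms.

```python
import heapq
import itertools


def total_score(threading, g1, g2):
    """Exact score of the placed segments: singleton terms plus pairwise terms."""
    d = len(threading)
    score = sum(g1(i, threading[i]) for i in range(d))
    score += sum(g2(i, j, threading[i], threading[j])
                 for i in range(d) for j in range(i + 1, d))
    return score


def best_first_threadings(n_segments, positions, g1, g2):
    """Yield (score, threading) pairs in order of increasing score (best first).

    `positions` is a sorted list of candidate sequence positions (an assumption
    of this sketch). A partial threading is a tuple of positions for the first
    few segments.
    """
    def lower_bound(prefix):
        # Exact score of the placed part, plus the smallest conceivable value
        # of every term that involves an unplaced segment. The ordering
        # constraint is ignored in the optimistic part, so this can only
        # underestimate the score of any completion.
        d = len(prefix)
        bound = total_score(prefix, g1, g2)
        for j in range(d, n_segments):
            bound += min(g1(j, t) for t in positions)
            for i in range(j):
                if i < d:
                    bound += min(g2(i, j, prefix[i], t) for t in positions)
                else:
                    bound += min(g2(i, j, s, t)
                                 for s in positions for t in positions)
        return bound

    tie = itertools.count()  # tie-breaker so the heap never compares prefixes
    heap = [(lower_bound(()), next(tie), ())]
    while heap:
        bound, _, prefix = heapq.heappop(heap)
        if len(prefix) == n_segments:
            # For a complete threading the bound is its exact score, and no
            # queued set can contain anything better.
            yield bound, prefix
            continue
        start = prefix[-1] + 1 if prefix else positions[0]
        for t in positions:
            if t >= start:  # keep segment positions strictly increasing
                child = prefix + (t,)
                heapq.heappush(heap, (lower_bound(child), next(tie), child))


# Arbitrary integer-valued toy scores (hypothetical, for illustration only).
def toy_g1(i, t):
    return (7 * i + 3 * t) % 5


def toy_g2(i, j, s, t):
    return ((i + 1) * (t - s) + j) % 4 - 1


ranked = list(best_first_threadings(3, list(range(6)), toy_g1, toy_g2))
best_score, best_threading = ranked[0]
```

Because the search is a generator, a caller can stop after the first result to obtain only the optimum, or keep consuming it to obtain the next-best candidates. The quality of the lower bound determines how much of the search space is actually expanded; the bound here is deliberately simple.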
