Homology-based fold predictions for Mycoplasma genitalium proteins.

Homology search techniques based on the iterative PSI-BLAST method, combined with various filters for low sequence complexity, are applied to assign folds to all Mycoplasma genitalium proteins. The resulting procedure (implemented as a web server) automatically predicts at least one domain in 37% of these proteins, with an estimated accuracy higher than 98%. Setting aside structural features such as coiled-coil or transmembrane regions, folds can be assigned to more than half of the globular proteins in a bacterium by iterative sequence comparison alone.
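
To illustrate what a low-complexity filter does before a profile search, here is a minimal sketch that masks windows of low Shannon entropy with "X". It is not the filter used in this work and not an implementation of any published tool. The function names, the window length of 12 and the entropy threshold of 2.2 bits are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace residues in low-entropy windows with 'X'.

    Hypothetical stand-in for a compositional-bias filter: every window whose
    entropy falls below `threshold` is masked in full, so a few residues
    flanking a biased region may be masked as well.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


if __name__ == "__main__":
    # A serine-rich stretch embedded in a compositionally diverse sequence.
    example = "ACDEFGHIKLMN" + "S" * 12 + "PQRTVWYACDEF"
    print(mask_low_complexity(example))
```

Masked residues would then be ignored by the homology search, which reduces spurious hits driven by compositional bias rather than by shared ancestry.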
