Probabilistic reconstruction of ancestral protein sequences

Using a maximum-likelihood formalism, we have developed a method for reconstructing the sequences of ancestral proteins. Our approach allows the calculation not only of the most probable ancestral sequence but also of the probability of any amino acid at any given node in the evolutionary tree. Because we model evolution at the amino acid level, we are better able to include the effects of evolutionary pressure and to take advantage of structural information about the protein through mutation matrices that depend on secondary structure and surface accessibility. The computational complexity of this method scales linearly with the number of homologous proteins used to reconstruct the ancestral sequence.
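As a rough illustration of how per-node amino acid probabilities can be obtained on a fixed tree, the sketch below applies a standard post-order likelihood recursion (Felsenstein-style pruning) to a single alignment column and normalizes the result at the root. It is not the implementation from this work. The equal-rates substitution model, the uniform prior, the dictionary tree layout, and all function names are assumptions made for the example; the method described above uses mutation matrices that depend on secondary structure and surface accessibility. Each node is visited once, which is consistent with cost growing linearly in the number of sequences.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)

def transition_prob(a, b, t):
    """P(a -> b) along a branch of length t (expected substitutions per site).

    Toy equal-rates model: an assumption standing in for the
    structure-dependent mutation matrices mentioned in the abstract.
    """
    decay = math.exp(-K * t / (K - 1))
    if a == b:
        return 1.0 / K + (K - 1) / K * decay
    return 1.0 / K - decay / K

def conditional_likelihoods(node):
    """Return {aa: P(leaves below node | node has residue aa)} for one site."""
    if "residue" in node:  # leaf: the observed residue has likelihood 1
        return {a: 1.0 if a == node["residue"] else 0.0 for a in AMINO_ACIDS}
    result = {a: 1.0 for a in AMINO_ACIDS}
    for child, t in node["children"]:  # each branch is processed exactly once
        child_like = conditional_likelihoods(child)
        for a in AMINO_ACIDS:
            result[a] *= sum(
                transition_prob(a, b, t) * child_like[b] for b in AMINO_ACIDS
            )
    return result

def root_posterior(root):
    """Posterior probability of each amino acid at the root (uniform prior)."""
    like = conditional_likelihoods(root)
    prior = 1.0 / K  # assumed uniform; real models use equilibrium frequencies
    total = sum(prior * value for value in like.values())
    return {a: prior * value / total for a, value in like.items()}

# Hypothetical three-leaf tree for one alignment column: ((A, A), G).
tree = {"children": [
    ({"children": [({"residue": "A"}, 0.1), ({"residue": "A"}, 0.1)]}, 0.2),
    ({"residue": "G"}, 0.3),
]}

posterior = root_posterior(tree)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

The sketch reports probabilities for the root only. Obtaining them at other internal nodes would require either re-rooting the tree at the node of interest or adding a second, top-down pass, and a full reconstruction would repeat the calculation for every column of the alignment.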
