Estimation of evolutionary distances under stationary and nonstationary models of nucleotide substitution.

Estimation of evolutionary distances has always been a major issue in the study of molecular evolution because evolutionary distances are required for estimating the rate of evolution in a gene, the divergence dates between genes or organisms, and the relationships among genes or organisms. Other closely related issues are the estimation of the pattern of nucleotide substitution, the estimation of the degree of rate variation among sites in a DNA sequence, and statistical testing of the molecular clock hypothesis. Mathematical treatments of these problems are considerably simplified by assuming a stationary process, in which the nucleotide compositions of the sequences under study have remained approximately constant over time. Stationary models of nucleotide substitution have now been studied fairly extensively, although some problems remain to be solved. Nonstationary models are much more complex, but significant progress has recently been made with the development of the paralinear and LogDet distances. This paper reviews recent studies on the above issues and reports results on correcting the estimation bias of evolutionary distances, on estimating the pattern of nucleotide substitution, and on estimating rate variation among the sites in a sequence.
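As a rough illustration of the kind of nonstationary distance mentioned above, the following sketch computes one commonly quoted form of the paralinear/LogDet distance from a pair of aligned sequences: d = -(1/4) [ln det F - (1/2) ln(prod f_x * prod f_y)], where F is the 4x4 matrix of joint nucleotide frequencies and f_x, f_y are its row and column sums. The function names, the 1/4 normalization, and the handling of non-ACGT characters are assumptions made for this example; the bias corrections discussed in the paper are not reproduced here.

```python
import math

BASES = "ACGT"


def divergence_matrix(seq_x, seq_y):
    """Return the 4x4 joint-frequency matrix F for two aligned sequences.

    F[i][j] is the fraction of sites with base i in seq_x and base j in seq_y.
    Sites where either sequence has a character outside ACGT are skipped
    (an assumption of this sketch).
    """
    counts = [[0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        if a in BASES and b in BASES:
            counts[BASES.index(a)][BASES.index(b)] += 1
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return [[c / n for c in row] for row in counts]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def paralinear_distance(seq_x, seq_y):
    """Hypothetical helper: normalized paralinear/LogDet-style distance."""
    f = divergence_matrix(seq_x, seq_y)
    fx = [sum(row) for row in f]                        # base frequencies in seq_x
    fy = [sum(f[i][j] for i in range(4)) for j in range(4)]  # base frequencies in seq_y
    det_f = det(f)
    norm = math.prod(fx) * math.prod(fy)
    if det_f <= 0 or norm <= 0:
        raise ValueError("distance undefined for these sequences")
    return -0.25 * (math.log(det_f) - 0.5 * math.log(norm))


if __name__ == "__main__":
    x = "AAAACCCCGGGGTTTT"
    y = "AAAGCCCCGGGGTTTT"  # one A -> G difference in 16 sites
    print(round(paralinear_distance(x, y), 4))
```

Because the distance depends only on the determinant of F and on the marginal base frequencies of each sequence, it does not require the two sequences to share the same nucleotide composition, which is what makes this family of distances usable under nonstationary models. The distance is undefined when det F is not positive, for example when one of the four bases is absent from a sequence.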
