Maximum-Likelihood Estimation of Site-Specific Mutation Rates in Human Mitochondrial DNA From Partial Phylogenetic Classification

The mitochondrial DNA hypervariable segment I (HVS-I) is widely used in studies of human evolutionary genetics, so accurate estimates of mutation rates among nucleotide sites in this region are essential. We have developed a novel maximum-likelihood methodology for estimating site-specific mutation rates from partial phylogenetic information, such as haplogroup association. The resulting estimation problem is a generalized linear model with a nonstandard link function. We develop inference and bias-correction tools for our estimates and a hypothesis-testing approach for site independence. We demonstrate our methodology using 16,609 HVS-I samples from the Genographic Project. Our results suggest that mutation rates among nucleotide sites in HVS-I are highly variable. The 16,400–16,500 region exhibits significantly lower rates than other regions, suggesting potential functional constraints. Several loci identified in the literature as possible termination-associated sequences (TAS) do not yield statistically slower rates than the rest of HVS-I, casting doubt on their functional importance. Our tests do not reject the null hypothesis of independent mutation rates among nucleotide sites, supporting the use of the site-independence assumption for analyzing HVS-I. Potential extensions of our methodology include its application to the estimation of mutation rates in other genetic regions, such as Y-chromosome short tandem repeats.
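The abstract does not state the form of the model, so the following is only a minimal sketch of how a site-specific rate could be estimated by maximum likelihood under one plausible setup, not the authors' actual method. It assumes that, for a single site, each haplogroup contributes a known total branch length (`exposures`, hypothetical) and a binary indicator of whether the site varies within that haplogroup (`observed`, hypothetical), with probability of variation 1 - exp(-rate * exposure). This link resembles a complementary log-log link without the inner logarithm, which is one way a "nonstandard" link can arise. The function names are illustrative.

```python
import math


def score(rate, exposures, observed):
    """Derivative of the log-likelihood with respect to the rate.

    Assumed model (not taken from the paper):
    P(site varies in haplogroup h) = 1 - exp(-rate * exposure_h).
    """
    total = 0.0
    for t, y in zip(exposures, observed):
        x = rate * t
        if y:
            # d/d(rate) of log(1 - exp(-x)), written to avoid overflow
            total += t * math.exp(-x) / (-math.expm1(-x))
        else:
            # d/d(rate) of log(exp(-x)) = -t
            total -= t
    return total


def mle_rate(exposures, observed, iterations=200):
    """Maximum-likelihood rate for one site, found by bisection on the score.

    The log-likelihood is concave in the rate, so the score has one root
    whenever the site varies in some haplogroups but not in all of them.
    """
    if not any(observed):
        return 0.0          # no variation seen: the likelihood peaks at 0
    if all(observed):
        return math.inf     # variation everywhere: no finite maximum
    lo, hi = 1e-12, 1.0
    while score(hi, exposures, observed) > 0:
        hi *= 2.0           # widen the bracket until the score turns negative
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid, exposures, observed) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy data: four haplogroups with equal exposure 2.0; the site varies in three.
rate_hat = mle_rate([2.0, 2.0, 2.0, 2.0], [1, 1, 1, 0])
```

With equal exposures T and k of n haplogroups showing variation, the estimate has the closed form -ln(1 - k/n) / T, which gives a quick check on the numerical routine. Bisection is used here only because the score is monotone in the rate; a Newton-type or iteratively reweighted fit would be a natural alternative when many sites and covariates are estimated together.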
