BMC Bioinformatics, Methodology article

Comparison of mode estimation methods and application in molecular clock analysis

Background: Distributions of time estimates in molecular clock studies are sometimes skewed or contain outliers. In those cases, the mode is a better estimator of the overall time of divergence than the mean or median. However, different methods are available for estimating the mode. We compared these methods in simulations to determine their strengths and weaknesses and further assessed their performance when applied to real data sets from a molecular clock study.

Results: We found that the half-range mode and robust parametric mode methods have a lower bias than other mode methods under a diversity of conditions. However, the half-range mode suffers from a relatively high variance and the robust parametric mode is more susceptible to bias by outliers. We determined that bootstrapping reduces the variance of both mode estimators. Application of the different methods to real data sets yielded results that were concordant with the simulations.

Conclusion: Because the half-range mode is a simple and fast method, and produced less bias overall in our simulations, we recommend the bootstrapped version of it as a general-purpose mode estimator and suggest a bootstrap method for obtaining the standard error and 95% confidence interval of the mode.
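
As a rough illustration of the recommended approach, the sketch below implements a simplified half-range mode and a bootstrap around it in Python. It is not the article's implementation. The function names, the tie-breaking rule (keep the first densest window), the number of resamples, and the use of a percentile interval for the 95% confidence interval are all assumptions made for this example.

```python
import random
import statistics


def half_range_mode(values):
    """Simplified half-range mode (illustrative sketch, not the published code).

    Repeatedly keeps the densest window whose width is half the range of the
    current data, until at most two points remain, then averages them.
    Ties are broken by keeping the first densest window (an assumption).
    """
    data = sorted(values)
    while len(data) > 2:
        width = (data[-1] - data[0]) / 2.0
        if width == 0:  # all remaining values are identical
            return data[0]
        best_lo, best_hi, best_count = 0, 0, 0
        hi = 0
        for lo in range(len(data)):
            hi = max(hi, lo)
            # Extend the window while it stays within the half-range width.
            while hi + 1 < len(data) and data[hi + 1] - data[lo] <= width:
                hi += 1
            count = hi - lo + 1
            if count > best_count:
                best_lo, best_hi, best_count = lo, hi, count
        data = data[best_lo:best_hi + 1]
    return statistics.mean(data)


def bootstrap_mode(values, n_boot=1000, seed=0):
    """Bootstrapped mode with standard error and a percentile 95% interval.

    n_boot, seed, and the percentile interval are hypothetical choices.
    """
    rng = random.Random(seed)
    values = list(values)
    modes = sorted(
        half_range_mode(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    estimate = statistics.mean(modes)    # bootstrapped mode estimate
    std_error = statistics.stdev(modes)  # spread of the bootstrap modes
    lower = modes[int(0.025 * (n_boot - 1))]
    upper = modes[int(0.975 * (n_boot - 1))]
    return estimate, std_error, (lower, upper)


if __name__ == "__main__":
    # Made-up, right-skewed sample with one outlier, for demonstration only.
    sample = [1.0, 2.0, 2.1, 2.2, 2.3, 2.4, 3.5, 4.0, 10.0]
    print(half_range_mode(sample))
    print(bootstrap_mode(sample, n_boot=500))
```

Averaging the bootstrap replicates is one way to obtain the variance reduction the abstract attributes to bootstrapping. The spread of the same replicates gives the standard error and interval.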
