Stochastic models and descriptive statistics for phylogenetic trees, from Yule to today

In 1924 Yule observed that distributions of the number of species per genus were typically long-tailed, and proposed a stochastic model to fit these data. Modern taxonomists often prefer to represent relationships between species via phylogenetic trees; the counterpart to Yule’s observation is that actual reconstructed trees look surprisingly unbalanced. The imbalance can readily be seen via a scatter diagram of the sizes of clades involved in the splits of published large phylogenetic trees. Attempting stochastic modeling leads to two puzzles. First, two somewhat opposite possible biological descriptions of what dominates the macroevolutionary process (adaptive radiation; “neutral” evolution) lead to exactly the same mathematical model (Markov or Yule or coalescent). Second, neither this nor any other simple stochastic model predicts the observed pattern of imbalance. This essay represents a probabilist’s musings on these puzzles, complementing the more detailed survey of biological literature by Mooers and Heard, Quart. Rev. Biol. 72 (1997) 31–54.
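To make the scatter diagram of split sizes concrete, the following is a minimal sketch of how such data could be generated under the Markov (Yule) model. It is not code from the essay. It relies on a standard property of that model: in a clade of m species, the size of one daughter clade at the basal split is uniform on 1, ..., m-1, and the two daughter clades then evolve independently in the same way. The function name `yule_split_sizes` and the choice to record the smaller daughter clade are illustrative assumptions.

```python
import random


def yule_split_sizes(n, rng):
    """Simulate the splits of an n-leaf tree under the Markov (Yule) model.

    Returns a list of (parent_clade_size, smaller_daughter_size) pairs,
    one per internal split. The function name and the output layout are
    illustrative choices, not taken from the essay.
    """
    splits = []
    stack = [n]  # clade sizes that still have to be split
    while stack:
        m = stack.pop()
        if m < 2:
            continue  # a single species is a leaf; nothing to split
        left = rng.randint(1, m - 1)  # uniform on 1..m-1 under the Yule model
        right = m - left
        splits.append((m, min(left, right)))
        stack.extend((left, right))
    return splits


if __name__ == "__main__":
    rng = random.Random(1)
    for parent, smaller in yule_split_sizes(20, rng):
        print(parent, smaller)
```

Plotting the smaller daughter size against the parent clade size for many simulated trees gives the model's version of the scatter diagram, which can then be set beside the same plot drawn from published trees.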
