INFERENCE OF DIVERGENCE TIMES AS A STATISTICAL INVERSE PROBLEM

A familiar complaint about statisticians and applied mathematicians is that they are the possessors of a relatively small number of rather elegant hammers with which they roam the world seeking convenient nails to pound, or at least screws they can pretend are nails. One all too often hears tales of scholars who have begun to describe the details of their particular research problem to a statistician, only to have the statistician latch on to a few phrases early in the conversation and then glibly announce that the problem is an exemplar of a standard one in statistics that has a convenient, pre-packaged solution – preferably one that uses some voguish, recently developed technique (bootstrap, wavelets, Markov chain Monte Carlo, hidden Markov models, ...). To some degree, this paper continues that fine tradition. We will observe that various facets of the inference of linguistic divergence times are indeed familiar nails to statisticians. However, we will depart from the tradition by being less than sanguine about whether statistics possesses, or can ever possess, the appropriate hammers to hit them. In particular, we find the assertion of Forster & Toth (2003) that

[1]  M. Swadesh. Towards Greater Accuracy in Lexicostatistic Dating, 1955, International Journal of American Linguistics.

[2]  Harry Hoijer, et al. Lexicostatistics: A Critique, 1956.

[3]  Fred W. Householder, et al. Validity of Glottochronology, 1964, Current Anthropology.

[4]  J. Tischler. Glottochronologie und Lexikostatistik, 1973.

[5]  M. Steel. Recovering a tree from the leaf colourations it generates under a Markov model, 1994.

[6]  Michael J. Sanderson, et al. A Nonparametric Approach to Estimating Divergence Times in the Absence of Rate Constancy, 1997.

[7]  Susan Holmes, et al. Phylogenies: An Overview, 1997.

[8]  Tandy J. Warnow, et al. A few logs suffice to build (almost) all trees (I), 1999, Random Struct. Algorithms.

[9]  Junhyong Kim, et al. Tutorial on Phylogenetic Tree Estimation, 1999, ISMB 1999.

[10]  Richard A. Berk, et al. Statistical Assumptions as Empirical Commitments, 2001.

[11]  Tandy J. Warnow, et al. Designing fast converging phylogenetic methods, 2001, ISMB.

[12]  M. Sanderson. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach, 2002, Molecular Biology and Evolution.

[13]  P. Stark. Inverse problems as statistics, 2002.

[14]  John Wakeley, et al. Estimating Divergence Times from Molecular Data on Phylogenetic and Population Genetic Timescales, 2002.

[15]  P. Forster, et al. Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[16]  R. Gray, et al. Language-tree divergence times support the Anatolian theory of Indo-European origin, 2003, Nature.