Testing of Evolutionary Independence in Simulated Phylogenetic Trees

Astolfi, P. (Istituto di Genetica, Università di Pavia, 27100 Pavia, Italy), A. Piazza (Istituto di Genetica Medica, Università di Torino, 10100 Torino, Italy), and K. K. Kidd (Department of Human Genetics, School of Medicine, Yale University, New Haven, Connecticut 06510 U.S.A.) 1978. Testing of evolutionary independence in simulated phylogenetic trees. Syst. Zool. 27:391-400. The dispersion matrix of a group of populations defined by a set of characters can be represented with a tree structure. The "treeness" test, T = c log R, where R is the ratio between the determinants of the observed and the estimated expected dispersion matrices and c is the number of characters, evaluates the validity of such a representation for a specific dispersion matrix. The test is based on a model of independent evolution: from an initial population a set of independent populations originates by successive splits. The treeness tends to a χ² distribution as the number of characters tends to infinity. We have analyzed the validity of the test (1) with different ratios (no. characters)/(no. populations), and (2) when conditions for independent evolution do not exist. These problems have been studied by designing two kinds of simulations. The first generates trees through a process of independent evolution, and the second produces trees where a fraction of the final population results from the fusion of a pair of ancestor populations. In the first case we have studied 500 simulated trees with various levels of complexity. We have seen that the T distributions are consistent with χ² distributions when the ratio (no. characters)/(no. populations) is greater than five. In the second case we have compared the distributions obtained from trees with independent populations and those obtained from trees with hybrid populations. In this case, even when the ratio (no. characters)/(no. populations) is high, the T distributions deviate very significantly from the χ² distribution.
Moreover, we have analyzed how much the time at which hybridization occurs affects the test for independent evolution. We have found that our test is relatively powerful in rejecting the hypothesis of independent evolution only when the hybridization takes place early enough, i.e., in the first half of the evolutionary time. The time intervals between two populations, estimated as the difference between the variances of the two populations, have been compared with those randomly determined by the simulation process. In the case of trees with independent populations the results confirm the expected value of 1 and support the soundness of our estimation procedure; in the case of trees with hybrid populations the time intervals involved in the hybridization process are, as expected, significantly underestimated. [Phylogenetic trees; independent evolution; simulation; hybridization; human evolution.]

A major problem in reconstructing human evolution in terms of phylogenetic trees is how to test whether real data are well or poorly represented by tree-like structures. Recently it has been shown (Cavalli-Sforza and Piazza, 1975; Piazza and Cavalli-Sforza, 1975, 1976) that populations which separate according to a given pattern of fission (a tree of descent) and evolve independently after each fission can be analyzed by standard multivariate techniques, and tests of hypotheses can be set up. This is possible because a model of independent evolution can be formulated by generalizing the approach used by Cavalli-Sforza and Edwards (Cavalli-Sforza and Edwards, 1964, 1967; Edwards, 1970). A number c of characters (for instance gene frequencies, measurements on morphological traits, etc.) are assumed to evolve following a Brownian motion: the amount of change in the characters is normally distributed with mean 0 and variance proportional to the time elapsed. At a fixed time a population splits into two. At this time the two daughter populations have the same values for the characters, but from that moment on the characters change independently, and the Brownian process continues separately in each population.
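
The splitting process just described can be illustrated with a minimal sketch in Python. It is not the authors' simulation program: the function name simulate_split, the parameters t_before, t_after and sigma2 (the variance accumulated per unit time), the starting value of 0 for every character, and the fixed random seed are all assumptions introduced here for illustration. The sketch covers a single split only, whereas the trees studied in the paper arise from successive splits.

```python
import math
import random


def simulate_split(c, t_before, t_after, sigma2=1.0, seed=0):
    """Simulate c characters under Brownian motion with a single split.

    Hypothetical helper: an ancestral population evolves for t_before,
    splits into two daughters, and each daughter then evolves
    independently for t_after.
    """
    rng = random.Random(seed)
    # Change over an interval is normal with mean 0 and variance
    # proportional to the time elapsed.
    sd_before = math.sqrt(sigma2 * t_before)
    sd_after = math.sqrt(sigma2 * t_after)
    # Ancestor starts at 0 for every character (an assumption).
    at_split = [rng.gauss(0.0, sd_before) for _ in range(c)]
    # Both daughters start from the same values, then change independently.
    daughter_a = [x + rng.gauss(0.0, sd_after) for x in at_split]
    daughter_b = [x + rng.gauss(0.0, sd_after) for x in at_split]
    return at_split, daughter_a, daughter_b


at_split, a, b = simulate_split(c=20000, t_before=1.0, t_after=2.0)

# Only post-split change separates the daughters, so the mean squared
# difference should be close to 2 * sigma2 * t_after.
mean_sq_diff = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# The shared history before the split shows up as a positive
# cross-product, close to sigma2 * t_before.
mean_cross = sum(x * y for x, y in zip(a, b)) / len(a)
```

With a large number of characters, mean_sq_diff approaches 2 × sigma2 × t_after and mean_cross approaches sigma2 × t_before. This is the sense in which the dispersion among populations carries information about the times separating them.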