Sample size for a phylogenetic inference.

The objective of this work is to describe sample-size calculations for the inference of a nonzero central branch length in an unrooted four-species phylogeny. Attention is restricted to independent binary characters, such as might be obtained from an alignment of the purine-pyrimidine sequences of a nucleic acid molecule. A statistical test based on a multinomial model for character-state configurations is described. The importance of including invariable sites in models for sequence change is demonstrated, and their effect on sample size is quantified. The methods are applied to a four-species alignment of small-subunit rRNA sequences derived from two archaebacteria, a eubacterium and a eukaryote. We conclude that the information in these sequences is not sufficient to resolve the branching order of this tree. Estimates of the number of aligned nucleotide positions required to provide a reasonably powerful test are given.
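As an illustration of the data such a multinomial model works with, the sketch below recodes four aligned sequences as purine/pyrimidine (R/Y) characters and tallies the eight possible site configurations, labeling each site by which species agree with the first. It is only a minimal sketch under stated assumptions, not the paper's procedure: the function names, the x/y labeling scheme, and the choice to skip sites containing gaps or ambiguous bases are all illustrative.

```python
from collections import Counter

# Assumption: standard purine/pyrimidine recoding; U is included for RNA.
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def to_ry(base):
    """Map a nucleotide to 'R' (purine) or 'Y' (pyrimidine); None otherwise."""
    b = base.upper()
    if b in PURINES:
        return "R"
    if b in PYRIMIDINES:
        return "Y"
    return None  # gap or ambiguity code


def site_pattern(column):
    """Label a four-species site relative to the first species.

    'x' marks species sharing the first species' state and 'y' the others,
    so the eight possible labels are xxxx, xxxy, xxyx, xyxx, xyyy, xxyy,
    xyxy and xyyx. Returns None if any base cannot be recoded.
    """
    states = [to_ry(b) for b in column]
    if None in states:
        return None
    first = states[0]
    return "".join("x" if s == first else "y" for s in states)


def tally_patterns(seqs):
    """Count site configurations over an alignment of exactly four sequences."""
    if len(seqs) != 4:
        raise ValueError("expected exactly four aligned sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for column in zip(*seqs):
        pattern = site_pattern(column)
        if pattern is not None:  # assumption: skip unscorable sites
            counts[pattern] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the paper.
    toy = ["AGCT", "AGCT", "ACCT", "GTC-"]
    print(dict(tally_patterns(toy)))
```

In this labeling, the constant pattern xxxx collects sites that show no purine-pyrimidine difference among the four species, and the three two-against-two patterns (xxyy, xyxy, xyyx) each group the species in the way one of the three possible unrooted topologies would.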
