One of the most important parameters in population genetics is $\theta = 4N_e\mu$, where $N_e$ is the effective population size and $\mu$ is the mutation rate per gene per generation. Using the maximum likelihood method and coalescent theory, we study two related problems: how much the accuracy of estimating $\theta$ can be improved over existing methods, and how to estimate the parameter $\lambda = \theta_1/\theta_2$, the ratio of two values of $\theta$. We derive the minimum variances of estimates of $\theta$ under two idealized situations; these serve as lower bounds on the variance of any estimate of $\theta$ obtainable in practice. We then show that Watterson's estimate of $\theta$, based on the number of segregating sites, is asymptotically optimal. For a finite sample of sequences, however, substantial improvement over Watterson's estimate is possible when $\theta$ is large. The maximum likelihood estimate of $\lambda$ is obtained and its properties are discussed.
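For orientation, Watterson's estimate mentioned above can be sketched in a few lines. This is a minimal illustration, not the maximum likelihood procedure studied here. It assumes the standard neutral infinite-sites model without recombination, under which the estimate is $S / a_n$ with $a_n = \sum_{i=1}^{n-1} 1/i$ and its variance is $\theta/a_n + b_n\theta^2/a_n^2$ with $b_n = \sum_{i=1}^{n-1} 1/i^2$. The function names are illustrative choices, not taken from the paper.

```python
def harmonic_sums(n):
    """Return (a_n, b_n) for a sample of n sequences.

    a_n = sum_{i=1}^{n-1} 1/i and b_n = sum_{i=1}^{n-1} 1/i^2.
    """
    if n < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n))
    b_n = sum(1.0 / i**2 for i in range(1, n))
    return a_n, b_n


def watterson_theta(num_segregating_sites, n):
    """Watterson's estimate of theta from the number of segregating sites S."""
    a_n, _ = harmonic_sums(n)
    return num_segregating_sites / a_n


def watterson_variance(theta, n):
    """Variance of Watterson's estimate for a given true theta.

    Assumes the neutral infinite-sites model with no recombination.
    """
    a_n, b_n = harmonic_sums(n)
    return theta / a_n + b_n * theta**2 / a_n**2


if __name__ == "__main__":
    # Hypothetical data: 11 segregating sites in a sample of 5 sequences.
    print(watterson_theta(11, 5))
    print(watterson_variance(5.0, 5))
```

The $\theta^2$ term in the variance does not vanish relative to $\theta/a_n$ as quickly when $\theta$ is large, which is consistent with the statement that improvement over Watterson's estimate is possible in that regime for finite samples.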