Proportion estimation with confidence limits.

A common task in microbiology is to determine the composition of a mixed population by drawing a sample from it and applying some procedure to identify the individuals in the sample. There may be a significant probability that the identification procedure misidentifies some members of the sample (for example, because the available data are insufficient to identify an individual unambiguously), which makes estimating the proportions in the underlying population non-trivial. A further complication arises when the population contains individuals that do not belong to any of the subpopulations recognised by the identification procedure. A simple algorithm is presented that addresses these problems and constructs a maximum likelihood estimate of the proportions, together with confidence limits. The technique is illustrated with an example from flow cytometry, in which phytoplankton cells are identified from the measured data by a radial basis function (RBF) neural network, and the limitations of the approach are discussed.
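To make the estimation problem concrete, the following is a minimal sketch of one standard way to obtain a maximum likelihood estimate of class proportions when the identifier makes errors. It is not necessarily the algorithm presented in the paper. It assumes that a confusion matrix giving the probability of each assigned label for each true class is known in advance (for instance, from labelled calibration data), and it uses expectation-maximisation (EM) on the label counts. The function name `estimate_proportions`, its parameters, and the example numbers are all hypothetical. Confidence limits and individuals from unrecognised subpopulations are not handled here.

```python
def estimate_proportions(counts, confusion, iterations=500):
    """EM estimate of true class proportions from error-prone labels.

    counts[i]       -- number of sampled individuals assigned label i
    confusion[i][j] -- assumed known P(assigned label i | true class j);
                       each column j should sum to 1
    """
    n_labels = len(counts)
    n_classes = len(confusion[0])
    total = sum(counts)
    p = [1.0 / n_classes] * n_classes  # start from uniform proportions

    for _ in range(iterations):
        new_p = [0.0] * n_classes
        for i in range(n_labels):
            if counts[i] == 0:
                continue
            # Probability of seeing label i under the current proportions.
            mix = sum(confusion[i][k] * p[k] for k in range(n_classes))
            if mix == 0.0:
                raise ValueError("observed a label with zero probability")
            for j in range(n_classes):
                # Expected number of label-i individuals truly in class j.
                new_p[j] += counts[i] * confusion[i][j] * p[j] / mix
        p = [x / total for x in new_p]
    return p


# Hypothetical two-class example: the identifier labels 90% of class 0
# and 80% of class 1 correctly; 410 and 590 individuals received each label.
confusion = [[0.9, 0.2],
             [0.1, 0.8]]
counts = [410, 590]
print(estimate_proportions(counts, confusion))
```

In this made-up example the raw label frequencies (0.41 and 0.59) differ from the proportions that best explain them (close to 0.3 and 0.7), which is the bias the abstract refers to. EM is used in the sketch because it keeps the estimates non-negative and summing to one, whereas directly inverting the confusion matrix can give negative proportions for small samples.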