On the profit of taking into account the known number of objects per class in classification methods (Corresp.)

In the chromosome classification problem, N chromosomes must be classified into k populations A_{1}, \cdots, A_{k} with known probability distributions. It is further known that exactly N_{i} of these N chromosomes belong to class A_{i}, i = 1, 2, \cdots, k. This is a compound decision problem whose optimal solution yields a classification algorithm that is not currently useful in practice because of its long computation time. Two other classification methods are considered and their results compared. One is the method often used to classify the 46 human chromosomes: it disregards the knowledge of the exact number of chromosomes of each type and uses only the a priori probability that a chromosome originates from population A_{i}. The other method permits only classifications with the correct number of objects in each class and selects, from all such classifications, the one that maximizes the likelihood function. This last method has some advantages for a small number of objects, particularly if the numbers of objects in the classes are equal.
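The difference between the two simpler methods can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the correspondence itself: the one-dimensional Gaussian class densities, the choice of N_{i}/N as the a priori probability, the feature values, and the hypothetical function names `classify_independent` and `classify_constrained`. The constrained search is done by brute-force enumeration, which is only feasible for a handful of objects.

```python
import math
from itertools import permutations


def gauss_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian (assumed class model, for illustration only)."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def classify_independent(xs, params, counts):
    """Classify each object on its own, using only the prior N_i / N (an assumption)."""
    n = sum(counts)
    labels = []
    for x in xs:
        scores = [math.log(c / n) + gauss_logpdf(x, m, s)
                  for (m, s), c in zip(params, counts)]
        labels.append(max(range(len(params)), key=scores.__getitem__))
    return labels


def classify_constrained(xs, params, counts):
    """Among labelings with exactly N_i objects in class i, pick the most likely one."""
    # Multiset of labels that respects the known class counts.
    base = [i for i, c in enumerate(counts) for _ in range(c)]
    best, best_ll = None, -math.inf
    # Brute force over distinct arrangements; exponential, so small N only.
    for labeling in set(permutations(base)):
        ll = sum(gauss_logpdf(x, *params[i]) for x, i in zip(xs, labeling))
        if ll > best_ll:
            best, best_ll = list(labeling), ll
    return best


# Hypothetical data: two classes, two objects known to belong to each.
params = [(0.0, 1.0), (3.0, 1.0)]   # (mean, std) per class
counts = [2, 2]                     # N_1 = N_2 = 2
xs = [0.2, 1.2, 1.4, 3.1]           # measured feature per object

independent = classify_independent(xs, params, counts)   # [0, 0, 0, 1]
constrained = classify_constrained(xs, params, counts)   # [0, 0, 1, 1]
```

In this toy case the independent rule places three objects in the first class, which contradicts the known counts, while the constrained rule moves the borderline object at 1.4 to the second class. For larger N the same constrained maximization can be posed as an assignment problem instead of being enumerated.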