Optimal screening designs for biomedical technology

This is the final report of a three-year Laboratory Directed Research and Development (LDRD) project at Los Alamos National Laboratory (LANL). Screening a large number of different types of molecules to isolate a few with desirable properties is essential in biomedical technology. For example, trying to find a particular gene in the human genome could be akin to looking for a needle in a haystack. Fortunately, testing mixtures, or pools, of molecules allows the desirable ones to be identified using a number of experiments proportional only to the logarithm of the total number of types of molecules. We show how to capitalize upon this potential by using optimized pooling schemes, or designs. We propose efficient non-adaptive pooling designs, such as “random sets” designs and modified “row and column” designs. Our results have been applied in the pooling and unique-sequence screening of clone libraries used in the Human Genome Project and in the mapping of human chromosome 16. For the largest clone libraries, this required the use of liquid-transferring robots and manifolds. Finally, we developed an efficient technique for finding the posterior probability that each molecule has the desirable property, given the pool assay results. This technique works well in practice, even if there are substantial rates of errors in the pool assay data. Both our methods and our results are relevant to a broad spectrum of research in modern biology.

Background and Research Objectives

This work was motivated by the need to screen large collections of arrayed clones of DNA sequences to identify clones containing a particular sequence. These collections are called clone libraries; existing libraries have up to 10^5 clones. One motivation for such screenings was to construct clone maps covering human chromosomes [8]. In this setting, unique, or non-repeated, sequences form an unambiguous link between all clones that contain them [12].
*Principal Investigator (E-mail: dct@lanl.gov)

It is, however, out of the question to perform individual