This is the final report of a three-year Laboratory Directed Research and Development (LDRD) project at Los Alamos National Laboratory (LANL). Screening a large number of different types of molecules to isolate a few with desirable properties is essential in biomedical technology. For example, trying to find a particular gene in the Human genome could be akin to looking for a needle in a haystack. Fortunately, testing of mixtures, or pools, of molecules allows the desirable ones to be identified using a number of experiments proportional only to the logarithm of the total number of types of molecules. We show how to capitalize upon this potential by using optimized pooling schemes, or designs. We propose efficient non-adaptive pooling designs, such as “random sets” designs and modified “row and column” designs. Our results have been applied in the pooling and unique-sequence screening of clone libraries used in the Human Genome Project and in the mapping of Human chromosome 16. For the largest clone libraries, this required the use of liquid-transferring robots and manifolds. Finally, we developed an efficient technique for finding the posterior probability that each molecule has the desirable property, given the pool assay results. This technique works well in practice, even if there are substantial rates of errors in the pool assay data. Both our methods and our results are relevant to a broad spectrum of research in modern biology.

Background and Research Objectives

This work was motivated by the need to screen large collections of arrayed clones of DNA sequences, to identify clones containing a particular sequence. These collections are called clone libraries; existing libraries have up to 10^5 clones. One motivation for such screenings was to construct clone maps covering Human chromosomes [8]. In this setting, unique, or non-repeated, sequences form an unambiguous link between all clones which contain them [12].
It is, however, out of the question to perform individual

*Principal Investigator (E-mail: dct@lanl.gov)
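The pooling idea summarized in the abstract can be illustrated with a minimal sketch. It assumes an error-free assay and reads a “random sets” design as one that assigns each clone to `k` pools chosen uniformly at random; that reading, the function names, and the parameter values are all assumptions made for illustration, not the report's optimized designs. A clone remains a candidate only if every pool containing it tests positive.

```python
import random


def random_sets_design(n_clones, n_pools, k, seed=0):
    """Assign each clone to k distinct pools chosen uniformly at random.

    Illustrative stand-in for a "random sets" design (an assumption).
    """
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(n_pools), k)) for _ in range(n_clones)]


def assay(design, positives, n_pools):
    """Error-free assay: a pool is positive if it holds any positive clone."""
    results = [False] * n_pools
    for clone in positives:
        for pool in design[clone]:
            results[pool] = True
    return results


def decode(design, pool_results):
    """Keep the clones whose pools all tested positive."""
    return [
        clone
        for clone, pools in enumerate(design)
        if all(pool_results[p] for p in pools)
    ]


# Hypothetical sizes: 1000 clones screened with 40 pools instead of 1000 tests.
N_CLONES, N_POOLS, K = 1000, 40, 5
design = random_sets_design(N_CLONES, N_POOLS, K, seed=1)
positives = {17, 402}
results = assay(design, positives, N_POOLS)
candidates = decode(design, results)
print(len(candidates), "candidate clones from", N_POOLS, "pool assays")
```

In the error-free case every true positive is always among the candidates; the candidate list may also contain some unresolved negatives, which would need confirmatory tests. Handling assay errors, as the report's posterior-probability technique does, is outside the scope of this sketch.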
[1] D. Torney. Mapping using unique sequences. Journal of molecular biology, 1991.
[2] Alexander Schliep, et al. Interpretation of Pooling Experiments Using the Markov Chain Monte Carlo Method. J. Comput. Biol., 1996.
[3] David C. Torney, et al. Optimizing Nonadaptive Group Tests for Objects with Heterogeneous Priors. SIAM J. Appl. Math., 1998.
[4] D. Balding, et al. Efficient pooling designs for library screening. Genomics, 1994.
[5] D. Torney, et al. Construction and characterization of a YAC library with a low frequency of chimeric clones from flow-sorted human chromosome 9. Genomics, 1993.
[6] E. Barillot, et al. Theoretical analysis of library screening using a N-dimensional pooling strategy. Nucleic acids research, 1991.
[7] N. Doggett, et al. An integrated physical map of human chromosome 16. Nature, 1995.
[8] Emanuel Knill, et al. Non-adaptive Group Testing in the Presence of Errors. Discret. Appl. Math., 1998.
[9] David C. Torney, et al. Optimal Pooling Designs with Error Detection. J. Comb. Theory, Ser. A, 1994.
[10] D J Balding, et al. The design of pooling experiments for screening a clone map. Fungal genetics and biology : FG & B, 1997.