Pooled Genomic Indexing (PGI): Analysis and Design of Experiments

Pooled Genomic Indexing (PGI) is a novel method for physically mapping clones onto known sequences. In PGI, arrayed clones are pooled, shotgun sequence reads are generated from the pools, and the reads are compared to a reference sequence. In the simplest case, clones are placed on an array and pooled by rows and columns. If a shotgun read from a row pool and a shotgun read from a column pool match the reference sequence close to each other, both reads are assigned to the clone at the intersection of the two pools, and the clone is mapped onto the region of the reference sequence between the two matches. A probabilistic model for PGI is developed, and several pooling designs are described and analyzed, including transversal designs and designs from linear codes. The probabilistic model and the pooling schemes are validated in simulated experiments in which 625 rat bacterial artificial chromosome (BAC) clones and 207 mouse BAC clones are mapped onto homologous human sequence.
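
The row-and-column case described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `map_clones`, the `max_distance` threshold, the representation of each match as a single reference coordinate, and the example data are all assumptions made for illustration. The probabilistic model and the more elaborate pooling designs are not reproduced.

```python
from itertools import product


def map_clones(row_hits, col_hits, max_distance):
    """Assign pairs of nearby matches to the clone at the pool intersection.

    row_hits / col_hits: dict mapping a pool index to a list of reference
    coordinates where reads from that pool matched (hypothetical input format).
    max_distance: assumed threshold for "close" matches, in bases.
    Returns a dict mapping (row, column) to a list of (start, end) regions.
    """
    assignments = {}
    for (row, row_positions), (col, col_positions) in product(
        row_hits.items(), col_hits.items()
    ):
        for p, q in product(row_positions, col_positions):
            if abs(p - q) <= max_distance:
                # The clone at (row, col) is mapped onto the region
                # of the reference between the two matches.
                region = (min(p, q), max(p, q))
                assignments.setdefault((row, col), []).append(region)
    return assignments


# Invented example: two row pools and two column pools with one match each.
example_rows = {0: [1_000], 1: [500_000]}
example_cols = {0: [480_000], 2: [1_500]}
example_result = map_clones(example_rows, example_cols, max_distance=50_000)
print(example_result)
```

With this input, only the clones at positions (0, 2) and (1, 0) receive a region, because the other row and column matches lie farther apart than the assumed threshold. A practical pipeline would also have to handle spurious matches and ambiguous intersections, which is where a probabilistic model becomes necessary.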
