Pooled Genomic Indexing (PGI): Mathematical Analysis and Experiment Design

Pooled Genomic Indexing (PGI) is a novel method for physical mapping of clones onto known macromolecular sequences. PGI is carried out by pooling arrayed clones, generating shotgun sequence reads from the pools, and comparing the reads against a reference sequence. If two reads from two different pools match the reference sequence within a short distance of each other, both reads are assigned (deconvoluted) to the clone at the intersection of the two pools, and the clone is mapped onto the region of the reference sequence between the two matches. A probabilistic model for PGI is developed, and several pooling schemes are designed and analyzed. The probabilistic model and the pooling schemes are validated in simulated experiments in which 625 rat BAC clones and 207 mouse BAC clones are mapped onto homologous human sequence.
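
The deconvolution rule described above can be illustrated with a minimal sketch. It assumes the simplest arrangement, a rectangular clone array with one pool per row and one per column, which is only one of the possible pooling schemes and is not necessarily any of the designs analyzed in the paper. The function name `deconvolute`, the `max_dist` threshold, and the sample hit positions are hypothetical and chosen only for illustration.

```python
from itertools import product


def deconvolute(row_hits, col_hits, max_dist):
    """Pair up reads from row pools and column pools (illustrative only).

    row_hits, col_hits: lists of (pool_index, reference_position), one entry
    per read that matched the reference sequence.
    max_dist: hypothetical threshold for "a close distance" on the reference.

    Returns a list of ((row, col), (start, end)) tuples: the clone at the
    intersection of the two pools and the reference interval between the
    two matches.
    """
    assignments = []
    for (row, pos_r), (col, pos_c) in product(row_hits, col_hits):
        if abs(pos_r - pos_c) <= max_dist:
            start, end = sorted((pos_r, pos_c))
            assignments.append(((row, col), (start, end)))
    return assignments


# Made-up example: two reads from row pools, two reads from column pools.
row_hits = [(0, 1_000), (2, 50_000)]
col_hits = [(1, 1_800), (3, 900_000)]
result = deconvolute(row_hits, col_hits, max_dist=5_000)
print(result)  # [((0, 1), (1000, 1800))]
```

Only the read from row pool 0 and the read from column pool 1 match within the threshold, so both are assigned to the clone at array position (0, 1), which is mapped onto the interval between the two matches. A real analysis would also need to account for false pairings caused by repeats or coincidental nearby matches, which is what the probabilistic model addresses.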
