xMAN: extreme MApping of OligoNucleotides

Background: The ability to rapidly map millions of oligonucleotide fragments to a reference genome is crucial to many high-throughput genomic technologies.

Results: We propose an intuitive and efficient algorithm, extreme MApping of OligoNucleotides (xMAN), to rapidly map millions of oligonucleotide fragments to a genome of any length. By converting oligonucleotides to integers hashed in RAM, xMAN can scan through genomes using bit-shifting operations and is at least one order of magnitude faster than existing tools. xMAN can map the 42 million 25-mer probes on the Affymetrix whole human genome tiling arrays to the entire genome in less than 6 CPU hours.

Conclusions: In addition to the speed advantage, we found that the probe mapping produced by xMAN substantially improved the final analysis results in both a spike-in experiment on ENCODE tiling arrays and an estrogen receptor ChIP-chip experiment on whole human genome tiling arrays. These improvements were confirmed by direct ChIP and real-time PCR assays. xMAN can be further extended to oligonucleotide mapping in other high-throughput genomic technologies.
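The idea of hashing oligonucleotides as integers and scanning a genome with bit shifts can be illustrated with a minimal sketch. This is not the xMAN implementation: the 2-bit base encoding, the function names (`encode`, `build_index`, `scan`), the forward-strand-only exact matching, and the handling of ambiguous bases are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed 2-bit code per base (illustrative; not taken from the paper).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(oligo):
    """Pack an oligo into one integer, 2 bits per base."""
    value = 0
    for base in oligo:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(probes):
    """Build an in-memory hash table: encoded sequence -> probe names."""
    index = defaultdict(list)
    for name, seq in probes.items():
        index[encode(seq.upper())].append(name)
    return index


def scan(genome, index, k):
    """Yield (position, probe_name) for each exact forward-strand match.

    The k-mer window is updated with one shift, one OR, and one mask
    per base instead of being re-encoded from scratch.
    """
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    window = 0
    valid = 0  # consecutive unambiguous bases currently in the window
    for i, base in enumerate(genome.upper()):
        code = BASE_CODE.get(base)
        if code is None:  # e.g. 'N': restart the window
            window = 0
            valid = 0
            continue
        window = ((window << 2) | code) & mask
        valid += 1
        if valid >= k:
            for name in index.get(window, ()):
                yield i - k + 1, name


# Hypothetical toy data: two 4-mer probes and a short sequence.
probes = {"p1": "ACGT", "p2": "GTAC"}
hits = list(scan("TTACGTACNACGT", build_index(probes), k=4))
```

With this toy input, `hits` contains matches for `p1` at positions 2 and 9 and for `p2` at position 4 (0-based). At 2 bits per base, a 25-mer occupies 50 bits, so each probe fits in a single 64-bit integer key.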
