Assessing the performance of different high-density tiling microarray strategies for mapping transcribed regions of the human genome.

Genomic tiling microarrays have become a popular tool for interrogating the transcriptional activity of large regions of the genome in an unbiased fashion. Each tiling experiment involves several key parameters (e.g., experimental protocols and genomic tiling density). Here, we assess the role of these parameters as they are manifested in different tiling-array platforms used for transcription mapping. First, we analyze how well a number of published tiling-array experiments agree with established gene annotation on human chromosome 22. We observe that the transcription detected from high-density arrays correlates substantially better with annotation than that from other array types. Next, we analyze the transcription-mapping performance of the two main high-density oligonucleotide array platforms in the ENCODE regions of the human genome. We hybridize identical biological samples and develop several ways of scoring the arrays and segmenting the genome into transcribed and nontranscribed regions, with the aim of making the platforms as comparable to each other as possible. Finally, we develop a platform comparison approach based on agreement with known annotation. Overall, we find that performance improves with more data points per locus, provided that the statistical scoring approach takes proper advantage of them; the additional data points come from higher genomic tiling density and from the use of replicate arrays and mismatch probes. While we do find significant differences in the performance of the two high-density platforms, we also find that they complement each other to some extent. In addition, our experiments reveal a significant amount of novel transcription outside of known genes, and an appreciable sample of this was validated by independent experiments.
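The abstract does not spell out the scoring and segmentation procedures, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a sliding-window median as the score, a fixed intensity threshold, and hypothetical `max_gap` and `min_run` parameters for merging positive probes into transcribed segments. It then measures nucleotide-level agreement with an annotation given as inclusive (start, end) intervals. All function names, parameters, and numbers are illustrative assumptions.

```python
from statistics import median


def window_scores(positions, intensities, half_width):
    """Score each probe by the median intensity of all probes
    within half_width bp of it (an assumed scoring scheme)."""
    scores = []
    for center in positions:
        window = [v for p, v in zip(positions, intensities)
                  if abs(p - center) <= half_width]
        scores.append(median(window))
    return scores


def segment(positions, scores, threshold, max_gap, min_run):
    """Merge probes scoring at or above threshold into (start, end)
    segments. Positive probes at most max_gap bp apart are joined;
    segments spanning fewer than min_run bp are dropped.
    Positions must be sorted in ascending order."""
    segments = []
    start = end = None
    for pos, score in zip(positions, scores):
        if score < threshold:
            continue
        if start is not None and pos - end <= max_gap:
            end = pos  # extend the current segment
        else:
            if start is not None and end - start >= min_run:
                segments.append((start, end))
            start = end = pos  # open a new segment
    if start is not None and end - start >= min_run:
        segments.append((start, end))
    return segments


def nucleotide_agreement(segments, annotation):
    """Return (sensitivity, positive predictive value) of called
    segments against annotated intervals, counted per nucleotide."""
    called = {b for s, e in segments for b in range(s, e + 1)}
    known = {b for s, e in annotation for b in range(s, e + 1)}
    overlap = len(called & known)
    sensitivity = overlap / len(known) if known else 0.0
    ppv = overlap / len(called) if called else 0.0
    return sensitivity, ppv


# Toy data: eight probes tiled every 10 bp, three of them bright.
positions = [0, 10, 20, 30, 40, 50, 60, 70]
intensities = [1, 1, 9, 9, 9, 1, 1, 1]
scores = window_scores(positions, intensities, half_width=10)
calls = segment(positions, scores, threshold=5, max_gap=10, min_run=20)
sens, ppv = nucleotide_agreement(calls, annotation=[(20, 49)])
```

Under these assumptions, a denser tiling or additional replicates would put more values into each window, which is one way a scoring scheme can take advantage of more data points per locus.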
