Transcript mapping with oligonucleotide high-density tiling arrays

Motivation: High-density DNA tiling microarrays are a powerful tool for the characterization of complete transcriptomes. The two major analytical challenges are the segmentation of the hybridization signal along genomic coordinates to accurately determine transcript boundaries and the adjustment of the sequence-dependent response of the oligonucleotide probes to achieve quantitative comparability of the signal between different probes.

Results: We describe a dynamic programming algorithm for finding a globally optimal fit of a piecewise constant expression profile along genomic coordinates. We developed a probe-specific background correction and scaling method that employs empirical probe response parameters determined from reference hybridizations with no need for paired mismatch (MM) probes. This combined analysis approach allows the accurate determination of dynamical changes in transcription architectures from hybridization data and will help to study the biological significance of complex transcriptional phenomena in eukaryotic genomes.

Availability: R package tilingArray at www.bioconductor.org.
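The segmentation idea named under Results can be illustrated with a minimal sketch. It assumes a least-squares cost (residual sum of squares around each segment mean) and a number of segments fixed in advance; both are illustrative assumptions, not details stated in the abstract. The function name `segment_piecewise_constant` is hypothetical and is not part of the tilingArray package.

```python
from itertools import accumulate


def segment_piecewise_constant(y, n_segments):
    """Globally optimal piecewise constant fit by dynamic programming.

    Assumption: the cost is the residual sum of squares (RSS) around each
    segment mean, and the number of segments is given by the caller.
    Returns (segment start indices, minimal total RSS).
    """
    n = len(y)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(y)")

    # Prefix sums of y and y^2 give the RSS of any interval in O(1).
    s1 = [0.0] + list(accumulate(y))
    s2 = [0.0] + list(accumulate(v * v for v in y))

    def cost(i, j):
        # RSS of y[i:j] around its own mean (requires j > i).
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal RSS when y[:j] is split into exactly k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    # back[k][j]: start index of the last segment in that optimal split.
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0

    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Trace back the segment starts from the end of the signal.
    starts = []
    j = n
    for k in range(n_segments, 0, -1):
        j = back[k][j]
        starts.append(j)
    starts.reverse()
    return starts, best[n_segments][n]


# Toy signal with three constant levels along a coordinate axis.
signal = [0, 0, 0, 5, 5, 5, 1, 1]
starts, rss = segment_piecewise_constant(signal, 3)
```

This sketch runs in time proportional to the number of segments times the square of the signal length, so a practical analysis of a whole chromosome would likely need a cap on segment length or a similar restriction.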
