Transcript mapping with high-density oligonucleotide tiling arrays

MOTIVATION: High-density DNA tiling microarrays are a powerful tool for characterizing complete transcriptomes. Two major analytical challenges are the segmentation of the hybridization signal along genomic coordinates, which is needed to determine transcript boundaries accurately, and the adjustment for the sequence-dependent response of the oligonucleotide probes, which is needed to make the signal quantitatively comparable between probes.

RESULTS: We describe a dynamic programming algorithm that finds a globally optimal fit of a piecewise constant expression profile along genomic coordinates. We also developed a probe-specific background correction and scaling method that uses empirical probe response parameters determined from reference hybridizations and does not require paired mismatch probes. Together, these two methods allow accurate determination of dynamic changes in transcriptional architecture from hybridization data and will help in studying the biological significance of complex transcriptional phenomena in eukaryotic genomes.

AVAILABILITY: R package tilingArray at http://www.bioconductor.org.
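The segmentation idea described under RESULTS can be illustrated with a small sketch. The abstract states only that dynamic programming yields a globally optimal piecewise constant fit; the code below assumes a least-squares cost and a number of segments fixed in advance. The function name `segment_piecewise_constant` and its arguments are hypothetical, written in Python for illustration, and do not reproduce the interface of the tilingArray package.

```python
import numpy as np

def segment_piecewise_constant(y, n_segments):
    """Least-squares piecewise constant fit by dynamic programming.

    Illustrative sketch only (assumed cost: residual sum of squares).
    Returns (bounds, rss), where bounds = [0, t1, ..., n] and segment s
    covers y[bounds[s]:bounds[s + 1]].
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(y)")

    # Prefix sums give the cost of any candidate segment in O(1).
    c1 = np.concatenate(([0.0], np.cumsum(y)))
    c2 = np.concatenate(([0.0], np.cumsum(y * y)))

    def cost(i, j):
        # Residual sum of squares of y[i:j] around its own mean (j > i).
        s = c1[j] - c1[i]
        return (c2[j] - c2[i]) - s * s / (j - i)

    # best[k, j]: minimal cost of fitting y[:j] with k + 1 segments.
    # back[k, j]: start index of the last segment in that optimal fit.
    best = np.full((n_segments, n + 1), np.inf)
    back = np.zeros((n_segments, n + 1), dtype=int)
    for j in range(1, n + 1):
        best[0, j] = cost(0, j)
    for k in range(1, n_segments):
        for j in range(k + 1, n + 1):
            for i in range(k, j):
                c = best[k - 1, i] + cost(i, j)
                if c < best[k, j]:
                    best[k, j] = c
                    back[k, j] = i

    # Trace the change points back from the end of the series.
    bounds = [n]
    j = n
    for k in range(n_segments - 1, 0, -1):
        j = int(back[k, j])
        bounds.append(j)
    bounds.append(0)
    bounds.reverse()
    return bounds, float(best[n_segments - 1, n])


# Toy signal with three constant levels along the coordinate axis.
signal = [0, 0, 0, 5, 5, 5, 5, 1, 1, 1]
bounds, rss = segment_piecewise_constant(signal, 3)
```

This sketch runs in time cubic in the number of probes for a fixed number of segments, so it suits short illustrative inputs only; a practical implementation would typically cap the maximum segment length and choose the number of segments with a model selection criterion.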
