TileMap: create chromosomal map of tiling array hybridizations

MOTIVATION Tiling array is a new type of microarray that can be used to survey genomic transcriptional activities and transcription factor binding sites at high resolution. The goal of this paper is to develop effective statistical tools to identify genomic loci that show transcriptional or protein binding patterns of interest. RESULTS A two-step approach is proposed and is implemented in TileMap. In the first step, a test-statistic is computed for each probe based on a hierarchical empirical Bayes model. In the second step, the test-statistics of probes within a genomic region are used to infer whether the region is of interest or not. Hierarchical empirical Bayes model shrinks variance estimates and increases sensitivity of the analysis. It allows complex multiple sample comparisons that are essential for the study of temporal and spatial patterns of hybridization across different experimental conditions. Neighboring probes are combined through a moving average method (MA) or a hidden Markov model (HMM). Unbalanced mixture subtraction is proposed to provide approximate estimates of false discovery rate for MA and model parameters for HMM. AVAILABILITY TileMap is freely available at http://biogibbs.stanford.edu/~jihk/TileMap/index.htm. SUPPLEMENTARY INFORMATION http://biogibbs.stanford.edu/~jihk/TileMap/index.htm (includes coloured versions of all figures).

[1]  John D. Storey,et al.  Statistical significance for genomewide studies , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[2]  Terence P. Speed,et al.  A comparison of normalization methods for high density oligonucleotide array data based on variance and bias , 2003, Bioinform..

[3]  Gordon K Smyth,et al.  Statistical Applications in Genetics and Molecular Biology Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments , 2011 .

[4]  S. P. Fodor,et al.  Large-Scale Transcriptional Activity in Chromosomes 21 and 22 , 2002, Science.

[5]  Clifford A. Meyer,et al.  A hidden Markov model for analyzing ChIP-chip experiments on genome tiling arrays and its application to p53 binding sequences , 2005, ISMB.

[6]  S. Cawley,et al.  Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. , 2004, Genome research.

[7]  Philipp Kapranov,et al.  Beyond expression profiling: next generation uses of high density oligonucleotide arrays. , 2003, Briefings in functional genomics & proteomics.

[8]  S. Cawley,et al.  Unbiased Mapping of Transcription Factor Binding Sites along Human Chromosomes 21 and 22 Points to Widespread Regulation of Noncoding RNAs , 2004, Cell.

[9]  C. Morris Natural Exponential Families with Quadratic Variance Functions: Statistical Theory , 1983 .

[10]  Sandrine Dudoit,et al.  Multiple Testing Methods For ChIP - Chip High Density Oligonucleotide Array Data , 2006, J. Comput. Biol..

[11]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[12]  Deepayan Sarkar,et al.  Detecting differential gene expression with a semiparametric hierarchical mixture method. , 2004, Biostatistics.

[13]  Pierre Baldi,et al.  A Bayesian framework for the analysis of microarray expression data: regularized t -test and statistical inferences of gene changes , 2001, Bioinform..