Nucleosome positioning from tiling microarray data

Motivation: The packaging of DNA around nucleosomes in eukaryotic cells plays a crucial role in the regulation of gene expression and other DNA-related processes. To better understand the regulatory role of nucleosomes, it is important to pinpoint their positions at high (5–10 bp) resolution. Toward this end, several recent works used dense tiling arrays to map nucleosomes in a high-throughput manner. These data were then parsed and hand-curated, and the positions of nucleosomes were assessed.

Results: In this manuscript, we present a fully automated algorithm to analyze such data and predict the exact location of nucleosomes. We introduce a method, based on a probabilistic graphical model, that increases the resolution of our predictions even beyond that of the microarray used. We show how to build such a model and how to compile it into a simple Hidden Markov Model, allowing for fast and accurate inference of nucleosome positions. We applied our model to nucleosomal data from mid-log yeast cells reported by Yuan et al. and compared our predictions to those of the original paper; to a more recent method that uses five times denser tiling arrays, as described by Lee et al.; and to a curated set of literature-based nucleosome positions. Our results suggest that, when applied to the same data used by Yuan et al., our fully automated model traced 13% more nucleosomes and increased the overall accuracy by about 20%. We believe that such an improvement opens the way for a better understanding of the regulatory mechanisms controlling gene expression, and of how they are encoded in the DNA.

Contact: nir@cs.huji.ac.il
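To make the idea of Hidden Markov Model inference over tiling-array probes concrete, the following is a minimal sketch and not the model described in the abstract. It assumes a hypothetical two-state chain (linker vs. nucleosome), Gaussian emissions over per-probe log-ratios, and hand-picked transition and emission parameters; all names and values here are illustrative assumptions. It decodes the most likely state per probe with the standard Viterbi algorithm.

```python
import math

# Hypothetical two-state model (an assumption for illustration only):
# state 0 = linker DNA, state 1 = nucleosome-covered DNA.
STATES = (0, 1)
LOG_START = (math.log(0.5), math.log(0.5))

# Assumed "sticky" transition probabilities, so calls form contiguous runs of probes.
LOG_TRANS = (
    (math.log(0.8), math.log(0.2)),  # from linker
    (math.log(0.1), math.log(0.9)),  # from nucleosome
)

# Assumed Gaussian emission parameters (mean, sd) of the probe log-ratio per state.
EMISSION = ((-1.0, 1.0), (1.0, 1.0))


def log_gauss(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(signal):
    """Return the most likely state (0 or 1) for each probe value in `signal`."""
    if not signal:
        return []
    # Best log score of any path ending in each state at the first probe.
    score = [LOG_START[s] + log_gauss(signal[0], *EMISSION[s]) for s in STATES]
    back = []  # back[t][s] = best predecessor of state s at probe t + 1
    for x in signal[1:]:
        new_score, pointers = [], []
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + LOG_TRANS[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + LOG_TRANS[prev][s] + log_gauss(x, *EMISSION[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    probes = [-1.2, -0.9, 1.1, 0.8, 1.3, 1.0, -1.1, -0.8]  # made-up log-ratios
    print(viterbi(probes))
```

A model intended to resolve positions more finely than the probe spacing would need a richer state space than this toy, for example states that track the offset within a nucleosome; the sketch only shows the decoding step on a per-probe chain.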

[1] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[2] L. Rabiner, et al. An introduction to hidden Markov models, 1986, IEEE ASSP Magazine.

[3] Andrew P. Sage, et al. Uncertainty in Artificial Intelligence, 1987, IEEE Transactions on Systems, Man, and Cybernetics.

[4] Judea Pearl, et al. Probabilistic reasoning in intelligent systems, 1988.

[5] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[6] D. Latchman. Gene Regulation: A Eukaryotic Perspective, 1990.

[7] J. Schmitz, et al. A nucleosome precludes binding of the transcription factor Pho4 in vivo to a critical target site in the PHO5 promoter, 1994, The EMBO journal.

[8] M. Borodovsky, et al. Nucleosome DNA sequence pattern revealed by multiple alignment of experimentally mapped sequences, 1996, Journal of molecular biology.

[9] T. Richmond, et al. Crystal structure of the nucleosome core particle at 2.8 Å resolution, 1997, Nature.

[10] Michael I. Jordan, et al. Loopy Belief Propagation for Approximate Inference: An Empirical Study, 1999, UAI.

[11] Nir Friedman, et al. Likelihood Computations Using Value Abstraction, 2000, UAI.

[12] P. Gregory, et al. A transient histone hyperacetylation signal marks nucleosomes for remodeling at the PHO8 promoter in vivo, 2001, Molecular cell.

[13] Tom H. Pringle, et al. The human genome browser at UCSC, 2002, Genome research.

[14] Nicola J. Rinaldi, et al. Transcriptional regulatory code of a eukaryotic genome, 2004, Nature.

[15] J. Lieb, et al. Evidence for nucleosome depletion at active regulatory regions genome-wide, 2004, Nature Genetics.

[16] Lani F. Wu, et al. Genome-Scale Identification of Nucleosome Positions in S. cerevisiae, 2005, Science.

[17] S. Schreiber, et al. Histone Variant H2A.Z Marks the 5′ Ends of Both Active and Inactive Genes in Euchromatin, 2006, Cell.

[18] Megan F. Cole, et al. Genome-wide Map of Nucleosome Acetylation and Methylation in Yeast, 2005, Cell.

[19] Irene K. Moore, et al. A genomic code for nucleosome positioning, 2006, Nature.

[20] J. Lieb, et al. A chromatin-mediated mechanism for specification of conditional transcription factor targets, 2006, Nature Genetics.

[21] I. Albert, et al. Nucleosome positions predicted through comparative genomics, 2006, Nature Genetics.

[22] Alexander J. Hartemink, et al. A Nucleosome-Guided Map of Transcription Factor Binding Sites in Yeast, 2007, PLoS Comput. Biol.

[23] Ronald W. Davis, et al. A high-resolution atlas of nucleosome occupancy in yeast, 2007, Nature Genetics.

[24] William Stafford Noble, et al. Nucleosome positioning signals in genomic DNA, 2007, Genome research.

[25] I. Albert, et al. Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome, 2007, Nature.

[26] Yair Weiss, et al. MAP Estimation, Linear Programming and Belief Propagation with Convex Free Energies, 2007, UAI.

[27] Dustin E. Schones, et al. Dynamic Regulation of Nucleosome Positioning in the Human Genome, 2008, Cell.

[28] Guo-Cheng Yuan, et al. Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion, 2007, PLoS Comput. Biol.

[29] Steven J. M. Jones, et al. Dynamic Remodeling of Individual Nucleosomes Across a Eukaryotic Genome in Response to Transcriptional Perturbation, 2007, PLoS biology.