NOrMAL: accurate nucleosome positioning using a modified Gaussian mixture model

Motivation: Nucleosomes are the basic elements of chromatin structure. They control the packaging of DNA and play a critical role in gene regulation by controlling the physical access of transcription factors to DNA. The advent of second-generation sequencing has enabled landmark genome-wide studies of nucleosome positions for several model organisms. Current methods to determine nucleosome positioning first compute an occupancy coverage profile by mapping nucleosome-enriched sequenced reads to a reference genome; nucleosomes are then placed according to the peaks of the coverage profile. These methods are quite accurate at placing isolated nucleosomes, but they do not properly handle more complex configurations. They also report only the positions of nucleosomes and their occupancy levels, whereas molecular biologists would benefit from additional information about nucleosomes, such as the probability of placement, the size of the DNA fragments enriched for nucleosomes, and whether nucleosomes are well positioned or 'fuzzy' in the sequenced cell sample.

Results: We address these issues with a novel method based on a parametric probabilistic model, a modified Gaussian mixture model. An expectation maximization algorithm is used to infer the parameters of the mixture of distributions. We compare our method against Template Filtering, which is considered the current state of the art, on synthetic data and on two real datasets. On synthetic data, we show that our method resolves complex configurations of nucleosomes more accurately and is more robust to user-defined parameters. On real data, we show that our method detects a significantly higher number of nucleosomes.

Availability: Visit http://www.cs.ucr.edu/~polishka

Contact: stelo@cs.ucr.edu or polishka@cs.ucr.edu
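To make the idea of inferring mixture parameters by expectation maximization concrete, the following is a minimal sketch of EM for a standard one-dimensional Gaussian mixture fitted to read positions. It is not the modified model described in the abstract, whose details are not given here. The function name `em_gmm_1d`, the initial guesses, the 1 bp floor on the standard deviation, and the synthetic read positions are all assumptions chosen for illustration. In this toy setting, each component's mean plays the role of a nucleosome position, its standard deviation indicates how 'fuzzy' the placement is, and its weight stands in for relative occupancy.

```python
import math
import random


def em_gmm_1d(xs, init_means, init_sigma=20.0, n_iter=50):
    """Plain EM for a 1D Gaussian mixture (standard GMM, NOT the paper's modified model)."""
    k = len(init_means)
    means = list(init_means)
    sigmas = [init_sigma] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each read position.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
                for w, m, s in zip(weights, means, sigmas)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, spread, and weight of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no support; leave it unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1.0)  # assumed floor of 1 bp
            weights[j] = nj / len(xs)
    return means, sigmas, weights


# Hypothetical synthetic data: a well-positioned nucleosome near 1000 bp
# and a fuzzier one near 1170 bp (values chosen only for this example).
random.seed(0)
reads = [random.gauss(1000, 15) for _ in range(500)]
reads += [random.gauss(1170, 30) for _ in range(500)]

means, sigmas, weights = em_gmm_1d(reads, init_means=[950.0, 1200.0])
for m, s, w in zip(means, sigmas, weights):
    print(f"position ~ {m:.1f} bp, spread ~ {s:.1f} bp, weight ~ {w:.2f}")
```

In this sketch the number of components is fixed in advance by the initial guesses. A practical nucleosome caller would also need a way to choose that number and to handle strand-specific reads, which this example does not attempt.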
