Prediction of Nucleosome Positioning Based on Transcription Factor Binding Sites

Background: The DNA of all eukaryotic organisms is packaged into nucleosomes, the basic repeating units of chromatin. The nucleosome consists of a histone octamer around which a DNA core is wrapped and the linker histone H1, which is associated with linker DNA. By altering the accessibility of DNA sequences, the nucleosome has profound effects on all DNA-dependent processes. Understanding the factors that influence nucleosome positioning is of great importance for the study of genomic control mechanisms. Transcription factors (TFs) have been suggested to play a role in nucleosome positioning in vivo.

Principal Findings: Here, the minimum redundancy maximum relevance (mRMR) feature selection algorithm, the nearest neighbor algorithm (NNA), and the incremental feature selection (IFS) method were used to identify the most important TFs that either favor or inhibit nucleosome positioning by analyzing the numbers of transcription factor binding sites (TFBSs) in 53,021 nucleosomal DNA sequences and 50,299 linker DNA sequences. A total of nine important families of TFs were extracted from 35 families, and the overall prediction accuracy was 87.4% as evaluated by the jackknife cross-validation test.

Conclusions: Our results are consistent with the notion that TFs are more likely to bind linker DNA sequences than the sequences in the nucleosomes. In addition, our results imply that there may be some TFs that are important for nucleosome positioning but that play an insignificant role in discriminating nucleosome-forming DNA sequences from nucleosome-inhibiting DNA sequences. The hypothesis that TFs play a role in nucleosome positioning is, thus, confirmed by the results of this study.
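
The selection pipeline named in the findings (mRMR ranking, nearest-neighbor classification, incremental feature selection, jackknife evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy TFBS count matrix, the function names, the use of mutual information on raw counts, and the squared Euclidean distance in the nearest-neighbor step are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import Counter
from math import log2

def mutual_information(a, b):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log2(c * n / (pa[x] * pb[v])) for (x, v), c in pab.items())

def mrmr_rank(X, y):
    """Greedy mRMR: relevance to the label minus mean redundancy with chosen features."""
    cols = [list(col) for col in zip(*X)]
    relevance = [mutual_information(col, y) for col in cols]
    ranked, remaining = [], list(range(len(cols)))
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(mutual_information(cols[j], cols[s]) for s in ranked)
            return relevance[j] - redundancy / len(ranked)
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

def jackknife_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `features`."""
    correct = 0
    for i, xi in enumerate(X):
        best_label, best_dist = None, float("inf")
        for k, xk in enumerate(X):
            if k == i:
                continue  # the held-out sample may not vote for itself
            # Squared Euclidean distance: an assumption, not taken from the paper.
            d = sum((xi[j] - xk[j]) ** 2 for j in features)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def incremental_feature_selection(X, y):
    """Add ranked features one at a time; keep the smallest prefix with top accuracy."""
    ranked = mrmr_rank(X, y)
    curve = [jackknife_accuracy(X, y, ranked[:k]) for k in range(1, len(ranked) + 1)]
    best_k = max(range(len(curve)), key=curve.__getitem__) + 1
    return ranked[:best_k], curve

# Hypothetical toy data: rows are DNA sequences, columns are TFBS counts for
# three made-up TF families; label 1 = nucleosomal, 0 = linker.
X = [
    [0, 1, 0], [0, 2, 0], [1, 1, 1], [1, 2, 1],
    [3, 1, 3], [3, 2, 3], [4, 1, 4], [4, 2, 4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    selected, curve = incremental_feature_selection(X, y)
    print("selected feature indices:", selected)
    print("jackknife accuracy per prefix size:", curve)
```

In this toy matrix the third column duplicates the first, so the redundancy term pushes it to the end of the ranking even though it is just as relevant to the label. This mirrors the abstract's remark that a TF can matter biologically yet add little to discrimination once a correlated TF is already selected.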
