Prediction of Nucleosome Forming and Nucleosome Inhibiting DNA Sequences Using Convolutional Neural Networks

The nucleosome is the fundamental unit of DNA packaging in eukaryotic cells. Nucleosomes play a major role in regulating gene activity: they act as barriers that control transcription by allowing or blocking access to DNA for DNA-binding proteins, transcription factors, and activators. In this paper, we propose a deep learning method based on a convolutional neural network (CNN) that accurately distinguishes nucleosome-forming from nucleosome-inhibiting DNA sequences. We use datasets from three organisms (Human, Fly, and Worm), and for each dataset we conduct two experiments: one that takes $k$-mer frequency counts as input and one that takes the DNA sequence directly. The proposed CNN models separate the two sequence types by learning the key regions and features that characterize each type. We compare our results with previous work on the same three datasets. The proposed CNN models achieve comparable results.
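The two input representations mentioned above can be illustrated with a minimal sketch. It assumes overlapping $k$-mers over the alphabet A, C, G, T and a one-hot encoding for the direct-sequence input; the function names `kmer_counts` and `one_hot`, and the choice to skip or zero out ambiguous bases, are illustrative assumptions rather than details taken from the paper.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: only the four canonical bases are encoded


def kmer_counts(seq, k):
    """Count overlapping k-mers and return a fixed-length vector.

    The vector has 4**k entries, ordered lexicographically (AA, AC, AG, ...).
    Windows containing any symbol outside ALPHABET (e.g. N) are skipped.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        pos = index.get(seq[start:start + k])
        if pos is not None:
            counts[pos] += 1
    return counts


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T).

    Symbols outside ALPHABET map to an all-zero row (an assumption).
    """
    table = {b: [1 if b == a else 0 for a in ALPHABET] for b in ALPHABET}
    return [list(table.get(b, [0, 0, 0, 0])) for b in seq.upper()]


if __name__ == "__main__":
    example = "ACGTAC"
    print(kmer_counts(example, 2))  # 16-dimensional count vector
    print(one_hot(example))         # 6 rows of 4 values each
```

Either output could then be passed to a CNN: the count vector as a fixed-length feature input, the one-hot matrix as a length-by-4 input for one-dimensional convolutions.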
