Big data analytics in genomics: The point on Deep Learning solutions

Next Generation Sequencing (NGS) is a catch-all term for the modern DNA sequencing applications that produce big genomics data, which can now be analysed faster than in the past. As a result, NGS requires increasingly sophisticated algorithms and high-performance parallel processing systems able to analyse and extract knowledge from huge amounts of genomic and molecular data. In this context, researchers are beginning to look at emerging deep learning algorithms able to perform efficient big data analytics. In this paper, we analyse and classify the major current deep learning solutions that allow biotechnology researchers to perform big genomics data analytics. Moreover, by means of a taxonomic analysis, we provide a clear picture of the current state of the art and discuss future challenges.
