Recent advancement in next-generation sequencing techniques and its computational analysis

Next-generation sequencing (NGS), a recently developed technology, has contributed substantially to research and development. The approach offers critical advantages over traditional capillary electrophoresis (CE)-based Sanger sequencing. Advances in NGS have led to numerous important discoveries that would have been costlier and more time-consuming with traditional CE-based Sanger sequencing. NGS methods are highly parallelized, enabling thousands to millions of molecules to be sequenced simultaneously. The technology generates huge amounts of data, which must be analysed to extract valuable information. Specific data analysis algorithms are written for specific tasks, and groups of such algorithms together serve as tools for analysing NGS data. Analysis of NGS data reveals important clues in the search for treatments of various life-threatening diseases, improved crop varieties, and other scientific problems related to human welfare. This review addresses the basic background of NGS technologies, their possible applications, the computational approaches and tools involved in NGS data analysis, and future opportunities and challenges in the area.
