Use of Bioinformatics Tools in Different Spheres of Life Sciences

Scientific knowledge is being produced and shared faster today than ever before. Different areas of science are converging to give rise to new disciplines. Bioinformatics is one such emerging field; it applies computers, mathematics, and statistics to molecular biology in order to archive, retrieve, and analyse biological data. Although still in its infancy, it has become one of the fastest-growing fields and has quickly established itself as an integral component of biological research activity. Its popularity stems from its ability to analyse huge amounts of biological data quickly and cost-effectively. Bioinformatics helps biologists extract valuable information from biological data by providing various web- and/or computer-based tools, the majority of which are freely available. The present review gives a comprehensive summary of some of the tools available to life scientists for analysing biological data. It focuses specifically on those areas of biological research that such tools can greatly assist: analysing DNA and protein sequences to identify various features, predicting the 3D structure of protein molecules, studying molecular interactions, and performing simulations that mimic a biological phenomenon.
