Neural Networks in Bioinformatics

Over the last two decades, neural networks (NNs) have gradually become an indispensable tool in bioinformatics. Their adoption was fueled by the development and rapid growth of numerous biological databases that store DNA and RNA sequences, protein sequences and structures, and other macromolecular structures. The size and complexity of these data require advanced computational tools. Computational analysis of these databases aims to expose hidden information that helps in understanding the underlying biological principles. The capability of neural networks most commonly exploited in bioinformatics is prediction. This is because a large body of raw data exists alongside only a limited amount of annotated data, and the annotated data can be used to derive a prediction model. In this chapter we summarize applications of neural networks in bioinformatics, with a particular focus on protein bioinformatics. We review the most frequently used neural network architectures and discuss several specific applications, including the prediction of protein secondary structure, solvent accessibility, and binding residues.
