Prediction of disorder with new computational tool: BVDEA

Because many intrinsically disordered regions in proteins play key roles in vital functions and in some diseases, identifying disordered regions has become essential for structure prediction and functional characterization of proteins. Many studies have therefore focused on accurate prediction of disorder. Machine learning techniques have mostly been used for this problem because they can extract the complex relationships and correlations hidden in large data sets. In this study, a novel method named Border Vector Detection and Extended Adaptation (BVDEA) was developed as an alternative, accurate classifier for predicting disorder. The classifier makes its predictions using three types of structural features of proteins. To assess the performance of the method, three computational learning techniques and eleven specific tools were used for comparison. Training was carried out with 5-fold cross-validation. Compared with the two learning methods LVQ and BVDA, the proposed method achieves the best classification performance. BVDEA also learns faster and more robustly than the others. The new method makes a significant contribution to predicting disordered and ordered regions of proteins.
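
As a rough illustration of the evaluation protocol mentioned above, the following minimal sketch runs 5-fold cross-validation around a plain LVQ1 classifier, one of the baseline families named in the abstract. It is not BVDEA or BVDA, whose details are not given here. The function names, the one-prototype-per-class initialisation, the learning-rate schedule, and the shape of the feature vectors are all assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(protos, x):
    """Index of the prototype closest to x."""
    return min(range(len(protos)), key=lambda j: sq_dist(protos[j][0], x))


def train_lvq1(X, y, epochs=20, lr=0.1, seed=0):
    """Plain LVQ1 (assumed setup: one prototype per class, started at the class mean)."""
    rng = random.Random(seed)
    protos = []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        mean = [sum(col) / len(members) for col in zip(*members)]
        protos.append((mean, label))
    order = list(range(len(X)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate (assumption)
        rng.shuffle(order)
        for i in order:
            vec, label = protos[nearest(protos, X[i])]
            # Pull the winning prototype toward a correctly labelled sample,
            # push it away from a wrongly labelled one.
            sign = 1.0 if label == y[i] else -1.0
            for d in range(len(vec)):
                vec[d] += sign * rate * (X[i][d] - vec[d])
    return protos


def predict(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]


def cross_validate(X, y, k=5):
    """Mean held-out accuracy over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        protos = train_lvq1([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(protos, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k
```

In this sketch `X` would hold one feature vector per residue or window and `y` the order/disorder labels; calling `cross_validate(X, y)` returns the accuracy averaged over the five held-out folds.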
