Prediction of disordered regions in proteins based on the meta approach

MOTIVATION: Intrinsically disordered regions in proteins have no unique stable structure in the absence of their partner molecules, and these regions can therefore prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are thought to play important roles in molecular interactions. Identifying disordered regions is therefore important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins.

RESULTS: We developed a new method for predicting disordered regions in proteins based on the meta approach and implemented it as a web server named 'metaPrDOS'. The method predicts the disorder tendency of each residue using support vector machines trained on the prediction results of seven independent predictors. The meta approach was evaluated on the CASP7 prediction targets, which avoids the overestimation that would result from including proteins used in the training sets of some component predictors. In this evaluation, the meta approach achieved higher prediction accuracy than all methods participating in CASP7.
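The meta approach described above can be illustrated with a minimal sketch: each residue is represented by the seven scores returned by the component predictors, and a support vector machine learns how to combine them into one disorder call. Everything concrete below is an assumption made for illustration and is not taken from metaPrDOS: the scores are synthetic, the SVM is linear, and it is trained with a simple Pegasos-style stochastic subgradient loop so that the example needs only the Python standard library. The function names (`encode`, `train_linear_svm`, `predict`) are hypothetical.

```python
import random

N_PREDICTORS = 7  # the abstract states seven component predictors


def encode(scores):
    """Centre the scores at 0 and append a constant 1.0 as the bias input."""
    return [s - 0.5 for s in scores] + [1.0]


def make_synthetic_residues(n, rng):
    """Return (features, labels) for n made-up residues.

    Labels are +1 (disordered) or -1 (ordered). Component scores are drawn
    around 0.7 for disordered and 0.3 for ordered residues; real predictor
    outputs would replace these synthetic values.
    """
    features, labels = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else -1
        mean = 0.7 if label == 1 else 0.3
        scores = [min(1.0, max(0.0, rng.gauss(mean, 0.2)))
                  for _ in range(N_PREDICTORS)]
        features.append(encode(scores))
        labels.append(label)
    return features, labels


def decision_value(weights, x):
    """Signed distance-like score; positive means 'disordered'."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_linear_svm(features, labels, rng, lam=0.01, n_iter=20000):
    """Pegasos-style stochastic subgradient descent on the hinge loss.

    This is a stand-in for a full SVM package, chosen only to keep the
    sketch dependency-free.
    """
    weights = [0.0] * len(features[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(features))
        eta = 1.0 / (lam * t)  # decaying step size
        violated = labels[i] * decision_value(weights, features[i]) < 1.0
        # Shrink step from the L2 regulariser.
        weights = [(1.0 - eta * lam) * w for w in weights]
        if violated:
            # Hinge-loss step for a residue inside the margin.
            weights = [w + eta * labels[i] * xi
                       for w, xi in zip(weights, features[i])]
    return weights


def predict(weights, scores):
    """Combine seven component scores into one call: +1 or -1."""
    return 1 if decision_value(weights, encode(scores)) > 0 else -1


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_synthetic_residues(500, rng)
    w = train_linear_svm(X, y, rng)
    print(predict(w, [0.9] * N_PREDICTORS), predict(w, [0.1] * N_PREDICTORS))
```

In practice the component scores would come from running each predictor on the query sequence, and an established SVM library with a tuned kernel would likely replace the hand-written training loop.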
