Studies on the rules of β-strand alignment in a protein β-sheet structure.

To further elucidate the mechanisms underlying protein β-sheet formation, the rules governing how β-strands align to form β-sheets were studied using statistical and machine learning approaches. First, a statistical analysis was performed on the number of β-strands lying in sequence between the two members of each β-strand pair. The results showed a propensity for near-neighbor pairing (a "first come first pair" rule) among β-strand pairs. Second, on the same dataset, pairwise cross-combinations of real β-strand pairs and four types of pairs containing pseudo-β-strands were classified with a support vector machine (SVM). A novel feature extraction approach based on the average amino acid pairing encoding matrix (APEM) was designed for the classification. The classification results indicated that a β-strand segment could distinguish β-strands from α-helix and coil segments. However, they also showed that a β-strand had no strong preference for its real partner over the alternative β-strand partners, which was consistent with the results of the statistical analysis. Thus, the "first come first pair" propensity and the weak ability of a β-strand to select its real partner are possibly important factors affecting how β-strands align to form β-sheet structures.
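
As an illustration of the first analysis only, the following minimal Python sketch counts how many β-strands lie between the two members of each pair and turns the counts into fractions. It is not the authors' code: the function names, the 0-based strand indexing, and the toy list of pairs are all assumptions made for this example, and real input would come from annotated sheet structures.

```python
from collections import Counter


def strand_separation_counts(pairs):
    """Count pairs by the number of strands lying between the two partners.

    `pairs` is an iterable of (i, j) strand indices, numbered in sequence
    order within one protein (0-based; an assumption of this sketch).
    Adjacent strands (e.g. 0 and 1) have a separation of 0.
    """
    counts = Counter()
    for i, j in pairs:
        if i == j:
            raise ValueError("a strand cannot be paired with itself")
        counts[abs(i - j) - 1] += 1
    return counts


def separation_fractions(counts):
    """Convert raw separation counts into fractions of all pairs."""
    total = sum(counts.values())
    return {sep: n / total for sep, n in sorted(counts.items())}


# Hypothetical toy protein with five strands (indices 0..4).
toy_pairs = [(0, 1), (1, 2), (2, 3), (0, 4)]
counts = strand_separation_counts(toy_pairs)
fractions = separation_fractions(counts)
print(fractions)
```

In this toy input, three of the four pairs are sequence neighbors, so a separation of 0 dominates; a "first come first pair" propensity in real data would appear as a similar skew toward small separations.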
