Paper Information - SpotOn: High Accuracy Identification of Protein-Protein Interface Hot-Spots

We present SpotOn, a web server that identifies and classifies interfacial residues as Hot-Spots (HS) or Null-Spots (NS). SpotOn implements a robust algorithm with a demonstrated accuracy of 0.95 and sensitivity of 0.98 on an independent test set. The predictor was developed using an ensemble machine learning approach with up-sampling of the minority class. It was trained on 53 complexes using a range of features derived from both protein 3D structure and sequence. The SpotOn web interface is freely available at: http://milou.science.uu.nl/services/SPOTON/.
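
The abstract mentions up-sampling of the minority class and reports accuracy and sensitivity. As a rough illustration of those two ideas only, the sketch below shows plain random up-sampling and the two metrics in standard-library Python. It is not SpotOn's actual pipeline: the function names `upsample_minority` and `accuracy_and_sensitivity`, the toy feature values, and the choice of random duplication with replacement are all assumptions made for this example. Only the HS/NS labels come from the text.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy_and_sensitivity(y_true, y_pred, positive="HS"):
    """Accuracy = correct / total; sensitivity = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return correct / len(pairs), tp / (tp + fn)


# Toy data, invented for illustration: three Null-Spots, one Hot-Spot.
rows = [[0.1], [0.2], [0.3], [0.9]]
labels = ["NS", "NS", "NS", "HS"]
balanced_rows, balanced_labels = upsample_minority(rows, labels)
print(Counter(balanced_labels))

acc, sens = accuracy_and_sensitivity(
    ["HS", "HS", "NS", "NS"], ["HS", "NS", "NS", "NS"]
)
print(acc, sens)
```

In practice, up-sampling is normally applied to the training split only, so that duplicated rows do not leak into the data used to estimate accuracy and sensitivity.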
