Amino Acid Encoding Methods for Protein Sequences: A Comprehensive Review and Assessment
Ruqian Lu | Qiwen Dong | Xiaoyang Jing | Daocheng Hong
[1] Lukasz A. Kurgan,et al. Review and comparative assessment of sequence‐based predictors of protein‐binding residues , 2018, Briefings Bioinform..
[2] Alice C McHardy,et al. Probabilistic variable-length segmentation of protein sequences for discriminative motif discovery (DiMotif) and sequence embedding (ProtVecX) , 2018, Scientific Reports.
[3] Ersin Emre Oren,et al. Sequence analysis (doi:10.1093/bioinformatics/btm436) , 2022 .
[4] Albert Y. Zomaya,et al. Machine Learning Techniques for Protein Secondary Structure Prediction: An Overview and Evaluation , 2008 .
[5] Hae-Jin Hu,et al. Improved protein secondary structure prediction using support vector machine with a new encoding scheme and an advanced tertiary classifier , 2004, IEEE Transactions on NanoBioscience.
[6] Xiaolong Wang,et al. A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis , 2008, BMC Bioinformatics.
[7] S. K. Riis,et al. Improving prediction of protein secondary structure using structured neural networks and multiple sequence alignments. , 1996, Journal of computational biology : a journal of computational molecular cell biology.
[8] Jaime G. Carbonell,et al. Comparative n-gram analysis of whole-genome protein sequences , 2002 .
[9] Robert David,et al. Applications of nonlinear system identification to protein structural prediction , 2000 .
[10] W. Kabsch,et al. Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.
[11] Ehsaneddin Asgari,et al. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics , 2015, PloS one.
[12] R. Jernigan,et al. Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation , 1985 .
[13] Zachary Wu,et al. Learned protein embeddings for machine learning , 2018, Bioinformatics.
[14] A. Tramontano,et al. Critical assessment of methods of protein structure prediction (CASP)—Round XII , 2018, Proteins.
[15] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[16] The Uniprot Consortium,et al. UniProt: a hub for protein information , 2014, Nucleic Acids Res..
[17] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[18] Byunghan Lee,et al. Deep learning in bioinformatics , 2016, Briefings Bioinform..
[19] María Martín,et al. UniProt: A hub for protein information , 2015 .
[20] S. Henikoff,et al. Automated assembly of protein blocks for database searching. , 1991, Nucleic acids research.
[21] T. D. Schneider,et al. Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli. , 1982, Nucleic acids research.
[22] R. Russell,et al. Amino‐Acid Properties and Consequences of Substitutions , 2003 .
[23] Kuldip K. Paliwal,et al. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? , 2016, Briefings Bioinform..
[24] Hongbo Mu,et al. An ensemble approach to protein fold classification by integration of template‐based assignment and support vector machine classifier , 2016, Bioinform..
[25] Johannes Schuchhardt,et al. Adaptive encoding neural networks for the recognition of human signal peptide cleavage sites , 2000, Bioinform..
[26] Qin Lu,et al. CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway , 2018, BMC Bioinformatics.
[27] Theodoros Damoulas,et al. Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection , 2008, Bioinform..
[28] Jiangning Song,et al. PhosContext2vec: a distributed representation of residue-level sequence contexts and its application to general and kinase-specific phosphorylation site prediction , 2018, Scientific Reports.
[29] J R Banavar,et al. Learning effective amino acid interactions through iterative stochastic techniques , 2000, Proteins.
[30] L. Kier,et al. Amino acid side chain parameters for correlation studies in biology and pharmacology. , 2009, International journal of peptide and protein research.
[31] G J Barton,et al. Evaluation and improvement of multiple sequence methods for protein secondary structure prediction , 1999, Proteins.
[32] Wei Zheng,et al. A large-scale comparative assessment of methods for residue–residue contact prediction , 2016, Briefings Bioinform..
[33] H. Scheraga,et al. Medium- and long-range interaction parameters between amino acids for predicting three-dimensional structures of proteins. , 1976, Macromolecules.
[34] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[35] Junjie Chen,et al. A comprehensive review and comparison of different computational methods for protein remote homology detection , 2018, Briefings Bioinform..
[36] S. Henikoff,et al. Amino acid substitution matrices from protein blocks. , 1992, Proceedings of the National Academy of Sciences of the United States of America.
[37] Stefan C. Kremer,et al. Amino acid encoding schemes for machine learning methods , 2011, 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW).
[38] Bin Liu,et al. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches , 2019, Briefings Bioinform..
[39] Zhen Li,et al. Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks , 2016, IJCAI.
[40] Guoli Wang,et al. PISCES: a protein sequence culling server , 2003, Bioinform..
[41] M. O. Dayhoff,et al. A Model of Evolutionary Change in Proteins , 1978 .
[42] W. Atchley,et al. Solving the protein sequence metric problem. , 2005, Proceedings of the National Academy of Sciences of the United States of America.
[43] Siby Abraham,et al. Reaching Optimized Parameter Set, Protein Secondary Structure Prediction Using Neural Network , 2018 .
[44] A. Biegert,et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment , 2011, Nature Methods.
[45] Steven E. Brenner,et al. SCOPe: Structural Classification of Proteins—extended, integrating SCOP and ASTRAL data and classification of new structures , 2013, Nucleic Acids Res..
[46] Richard Wolfenden,et al. Comparing the polarities of the amino acids: side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution , 1988 .
[47] Junjie Chen,et al. Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences , 2015, Nucleic Acids Res..
[48] A. Godzik,et al. Derivation and testing of pair potentials for protein folding. When is the quasichemical approximation correct? , 1997, Protein science : a publication of the Protein Society.
[49] Anders Krogh,et al. Improving Prediction of Protein Secondary Structure Using Structured Neural Networks and Multiple Sequence Alignments , 1996, J. Comput. Biol..
[50] C. Anfinsen. Principles that govern the folding of protein chains. , 1973, Science.
[51] Jian Peng,et al. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields , 2015, Scientific Reports.
[52] G. Rose,et al. Hydrophobicity of amino acid residues in globular proteins. , 1985, Science.
[53] R. Jernigan,et al. Self‐consistent estimation of inter‐residue protein contact energies based on an equilibrium mixture approximation of residues , 1999, Proteins.
[54] A G Murzin,et al. SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.
[55] Anna Tramontano,et al. Critical assessment of methods of protein structure prediction (CASP) — round X , 2014, Proteins.
[56] Kuldip K. Paliwal,et al. Capturing non‐local interactions by long short‐term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility , 2017, Bioinform..
[57] Arne Elofsson,et al. A study on protein sequence alignment quality , 2002, Proteins.
[58] Ole Winther,et al. Deep Recurrent Conditional Random Field Network for Protein Secondary Prediction , 2017, BCB.
[59] Dennis Shasha,et al. New techniques for extracting features from protein sequences , 2001, IBM Syst. J..
[60] S H Kim,et al. Environment-dependent residue contact energies for proteins. , 2000, Proceedings of the National Academy of Sciences of the United States of America.
[61] S. W. Atlanta. Using a neural network to backtranslate amino acid sequences , .
[62] Jie Hou,et al. DeepSF: deep convolutional neural network for mapping protein sequences to folds , 2017, Bioinform..
[63] Brian R. King,et al. Mining for class-specific motifs in protein sequence classification , 2012, BMC Bioinformatics.
[64] S F Altschul,et al. Iterated profile searches with PSI-BLAST--a tool for discovery in protein databases. , 1998, Trends in biochemical sciences.
[65] R Dustin Schaeffer,et al. CASP 11 target classification , 2016, Proteins.
[66] R. Jernigan,et al. Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. , 1996, Journal of molecular biology.
[67] Jens Meiler,et al. Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks , 2001 .
[68] William R Taylor,et al. Amino acid encoding schemes from protein structure alignments: multi-dimensional vectors to describe residue types. , 2002, Journal of theoretical biology.
[69] P.C. Tai,et al. Parallel protein secondary structure prediction based on neural networks , 2004, The 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society.
[70] Minoru Kanehisa,et al. AAindex: amino acid index database, progress report 2008 , 2007, Nucleic Acids Res..