A review of machine learning methods to predict the solubility of overexpressed recombinant proteins in Escherichia coli
Siti Zaiton Mohd Hashim | Narjeskhatoon Habibi | Alireza Norouzi | Mohammed Samian
[1] Jin-Kao Hao,et al. Pattern Recognition in Bioinformatics , 2013, Lecture Notes in Computer Science.
[2] Pierre Baldi,et al. SOLpro: accurate sequence-based prediction of protein solubility , 2009, Bioinform..
[3] Philip E. Bourne,et al. The RCSB PDB information portal for structural genomics , 2005, Nucleic Acids Res..
[4] Emanuele Tomba,et al. Prediction of protein solubility in Escherichia coli using logistic regression , 2010, Biotechnology and bioengineering.
[5] Jack Sklansky,et al. On Automatic Feature Selection , 1988, Int. J. Pattern Recognit. Artif. Intell..
[6] John D. Westbrook,et al. TargetDB: a target registration database for structural genomics projects , 2004, Bioinform..
[7] A. A. Mullin,et al. Principles of neurodynamics , 1962 .
[8] Gregory Piatetsky-Shapiro,et al. Discovery, Analysis, and Presentation of Strong Rules , 1991, Knowledge Discovery in Databases.
[9] David E Hill,et al. High-throughput expression of C. elegans proteins. , 2004, Genome research.
[10] Wen-Liang Chen,et al. Prediction and analysis of protein solubility using a novel scoring card method with dipeptide composition , 2012, BMC Bioinformatics.
[11] Marko Robnik-Sikonja,et al. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF , 2004, Applied Intelligence.
[12] Pankaj Kumar,et al. Granular Support Vector Machine Based Method for Prediction of Solubility of Proteins on Overexpression in Escherichia Coli , 2007, PReMI.
[13] Mark Gerstein,et al. Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis. , 2004, Journal of molecular biology.
[14] Aixia Guo,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2014 .
[15] Dmitrij Frishman,et al. Protein solubility: sequence based prediction and experimental verification , 2007, Bioinform..
[16] R G Harrison,et al. New fusion protein systems designed to give soluble expression in Escherichia coli. , 1999, Biotechnology and bioengineering.
[17] Mark Gerstein,et al. SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics , 2001, Nucleic Acids Res..
[18] BMC Bioinformatics , 2005 .
[19] Bernhard Schölkopf,et al. Feature selection and transduction for prediction of molecular bioactivity for drug design , 2003, Bioinform..
[20] H. B. Mann,et al. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other , 1947 .
[21] Marcel J. T. Reinders,et al. Exploring Sequence Characteristics Related to High-Level Production of Secreted Proteins in Aspergillus niger , 2012, PloS one.
[22] Thomas G. Dietterich. Editorial Exploratory research in machine learning , 1990, Machine Learning.
[23] Susan Idicula-Thomas,et al. Understanding the relationship between the primary structure of proteins and its propensity to be soluble on overexpression in Escherichia coli , 2005, Protein science : a publication of the Protein Society.
[24] J. Kittler,et al. Feature Set Search Algorithms , 1978 .
[25] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[26] Z. R. Li,et al. Update of PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence , 2006, Nucleic Acids Res..
[27] Mark Gerstein,et al. Structural proteomics of an archaeon , 2000, Nature Structural Biology.
[28] Shoji Takada,et al. Bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of Escherichia coli proteins , 2009, Proceedings of the National Academy of Sciences.
[29] Chun-Nan Hsu,et al. Learning to predict expression efficacy of vectors in recombinant protein production , 2010, BMC Bioinformatics.
[30] Pedro Larrañaga,et al. A review of feature selection techniques in bioinformatics , 2007, Bioinform..
[31] T. N. Bhat,et al. The Protein Data Bank , 2000, Nucleic Acids Res..
[32] Shuichi Hirose,et al. Statistical analysis of features associated with protein expression/solubility in an in vivo Escherichia coli expression system and a wheat germ cell-free expression system. , 2011, Journal of biochemistry.
[33] William Frawley,et al. Knowledge Discovery in Databases , 1991 .
[34] Jiangning Song,et al. Bioinformatics approaches for improved recombinant protein production in Escherichia coli: protein solubility prediction , 2014, Briefings Bioinform..
[35] Peter Kokol,et al. Stability of different feature selection methods for selecting protein sequence descriptors in protein solubility classification problem , 2010, 2010 IEEE 23rd International Symposium on Computer-Based Medical Systems (CBMS).
[36] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[37] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .
[38] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[39] Dmitrij Frishman,et al. PROSO II – a new method for protein solubility prediction , 2012, The FEBS journal.
[40] Alberto Maria Segre,et al. Programs for Machine Learning , 1994 .
[41] Shuichi Hirose,et al. ESPRESSO: A system for estimating protein expression and solubility in protein expression systems , 2013, Proteomics.
[42] Pedro Larrañaga,et al. A review of feature selection techniques in bioinformatics , 2007 .
[43] Peter E. Hart,et al. Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.
[44] Zhong Wang,et al. Prediction of protein solubility in E. coli , 2012, 2012 IEEE 8th International Conference on E-Science.
[45] Chi Hau Chen,et al. Pattern recognition and signal processing , 1978 .
[46] Jianwen Fang,et al. Discrimination of soluble and aggregation-prone proteins based on sequence information. , 2013, Molecular bioSystems.
[47] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[48] David L. Wilkinson,et al. Predicting the Solubility of Recombinant Proteins in Escherichia coli , 1991, Bio/Technology.
[49] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[50] L. Jiang,et al. PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence , 2006, Nucleic Acids Res..
[51] Sumio Sugano,et al. Human Gene and Protein Database (HGPD): a novel database presenting a large quantity of experiment-based results in human proteomics , 2009, Nucleic Acids Res..
[52] Michele Vendruscolo,et al. Sequence-based prediction of protein solubility. , 2012, Journal of molecular biology.
[53] Bhaskar D. Kulkarni,et al. A support vector machine-based method for predicting the propensity of a protein to be soluble or to form inclusion body on overexpression in Escherichia coli , 2006, Bioinform..
[54] P. Kokol,et al. Comprehensive Decision Tree Models in Bioinformatics , 2012, PloS one.
[55] L. N. Kanal,et al. Handbook of Statistics, Vol. 2. Classification, Pattern Recognition and Reduction of Dimensionality. , 1985 .
[56] Feng Shi,et al. Predicting the protein solubility by integrating chaos games representation and entropy in information theory , 2014, Expert Syst. Appl..