JFeature: A Java Package for Extracting Global Sequence Features from Proteins for Functional Classification

Predicting the functional properties of proteins has long been a central theme of bioinformatics in the post-genomic era. Statistical learning, used alongside analysis based on sequence similarity, has proven successful at detecting complex sequence-function associations in many applications. JFeature is an integrated Java tool that facilitates the extraction of global sequence features and the preparation of example sets for statistical learning studies of sequence-function relationships. Through a user-friendly graphical interface, it computes composition, distribution, transition, and auto-correlation features from a protein sequence. It also helps assemble a negative example set based on the most-dissimilar principle. The Java package and supplementary documentation are available at http://www.cls.zju.edu.cn/rlibs/software/jfeature.html.

DOI: 10.4018/978-1-4666-3604-0.ch060
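To make the notion of a global sequence feature concrete, the following is a minimal sketch of two of the feature types named above: amino acid composition and a transition frequency between two residue groups. It is not JFeature's code or API. The class name `GlobalFeatureSketch`, the method names, and the two residue groups used in `main` (a polar group and a hydrophobic group) are assumptions chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration only; not part of the JFeature package.
public class GlobalFeatureSketch {

    // The 20 standard amino acids in one-letter code.
    static final String AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY";

    /** Fraction of each standard residue in the sequence; other letters are ignored. */
    static double[] composition(String sequence) {
        double[] fractions = new double[AMINO_ACIDS.length()];
        int total = 0;
        for (char c : sequence.toUpperCase(Locale.ROOT).toCharArray()) {
            int idx = AMINO_ACIDS.indexOf(c);
            if (idx >= 0) {
                fractions[idx]++;
                total++;
            }
        }
        if (total > 0) {
            for (int i = 0; i < fractions.length; i++) {
                fractions[i] /= total;
            }
        }
        return fractions;
    }

    /**
     * Fraction of adjacent residue pairs that cross from groupA to groupB
     * or from groupB to groupA. Groups are given as strings of one-letter codes.
     */
    static double transition(String sequence, String groupA, String groupB) {
        String s = sequence.toUpperCase(Locale.ROOT);
        if (s.length() < 2) {
            return 0.0;
        }
        int crossings = 0;
        for (int i = 0; i + 1 < s.length(); i++) {
            char x = s.charAt(i);
            char y = s.charAt(i + 1);
            boolean aToB = groupA.indexOf(x) >= 0 && groupB.indexOf(y) >= 0;
            boolean bToA = groupB.indexOf(x) >= 0 && groupA.indexOf(y) >= 0;
            if (aToB || bToA) {
                crossings++;
            }
        }
        return (double) crossings / (s.length() - 1);
    }

    public static void main(String[] args) {
        String seq = "MKRLLVAACDE"; // made-up example sequence
        // Assumed grouping for illustration: polar vs. hydrophobic residues.
        String polar = "RKEDQN";
        String hydrophobic = "CLVIMFW";
        System.out.println(Arrays.toString(composition(seq)));
        System.out.println(transition(seq, polar, hydrophobic));
    }
}
```

Both methods return fixed-length numeric values regardless of sequence length, which is what allows proteins of different lengths to be fed to the same statistical learner.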
