Predicting networking couples for metabolic pathways of Arabidopsis

Given an enzyme-compound couple, how can we identify whether it is a networking couple or a non-networking couple? This question is central to investigating metabolic pathways. To address it, a novel approach was developed that uses knowledge of gene ontology (GO), chemical functional groups (FunG), and pseudo amino acid composition (PseAA) to represent samples of enzyme-compound couples. Two basic identifiers were formulated: one called “GO-FunG” and the other “PseAA-FunG”. Prediction was performed by fusing these two basic identifiers into one. As a showcase, the approach was applied to the metabolic pathways of Arabidopsis thaliana, a small flowering plant widely used as a model organism for studies of the cellular and molecular biology of flowering plants. The average overall success rate in jackknife cross-validation tests over the 72 metabolic pathways in the Arabidopsis system was over 95%, suggesting that the approach may become a useful tool for studying metabolic pathways and other problems in areas related to cellular networking.
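
The jackknife test and the idea of fusing two identifiers can be illustrated with a minimal sketch. The abstract does not specify the classifier or the fusion rule, so the code below assumes a nearest-neighbor classifier and a simple fallback rule (use the first descriptor when it is available, otherwise the second). The names `primary`, `fallback`, `fused_predict`, and `jackknife_success_rate`, as well as the toy vectors, are hypothetical and serve only as illustration; they are not the descriptors or data used in the study.

```python
from math import dist

def nearest_label(query, pool):
    """Return the label of the vector in `pool` closest to `query` (1-NN)."""
    vector, label = min(pool, key=lambda item: dist(query, item[0]))
    return label

def fused_predict(sample, train):
    """Assumed fusion rule: prefer the primary descriptor, else fall back."""
    if sample["primary"] is not None:
        pool = [(s["primary"], s["label"]) for s in train
                if s["primary"] is not None]
        if pool:
            return nearest_label(sample["primary"], pool)
    # The fallback descriptor is assumed to exist for every sample.
    pool = [(s["fallback"], s["label"]) for s in train]
    return nearest_label(sample["fallback"], pool)

def jackknife_success_rate(samples):
    """Leave each sample out in turn, predict it from the rest, count hits."""
    hits = 0
    for i, held_out in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if fused_predict(held_out, train) == held_out["label"]:
            hits += 1
    return hits / len(samples)

# Toy enzyme-compound couples (invented values, illustration only).
samples = [
    {"primary": (1.0, 0.0), "fallback": (0.9, 0.1), "label": "networking"},
    {"primary": (0.9, 0.1), "fallback": (0.8, 0.2), "label": "networking"},
    {"primary": None, "fallback": (0.85, 0.15), "label": "networking"},
    {"primary": (0.0, 1.0), "fallback": (0.1, 0.9), "label": "non-networking"},
    {"primary": (0.1, 0.9), "fallback": (0.2, 0.8), "label": "non-networking"},
    {"primary": None, "fallback": (0.15, 0.85), "label": "non-networking"},
]

rate = jackknife_success_rate(samples)
print(f"jackknife success rate: {rate:.2f}")
```

In a jackknife test every sample is predicted by a model that never saw it, so the reported rate does not depend on how the data are split into training and test subsets.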
