Uncovering Arabidopsis Membrane Protein Interactome Enriched in Transporters Using Mating-Based Split Ubiquitin Assays and Classification Models

High-throughput data are a double-edged sword: the benefit of a large amount of data comes at the cost of noise. To increase the reliability and scalability of high-throughput protein interaction data generation, we tested the efficacy of classification models in enriching for potential protein–protein interactions. We applied this method to identify interactions among Arabidopsis membrane proteins enriched in transporters, and validated it with multiple retests. Classification improved the quality of the ensuing interaction network and was effective in reducing the search space and increasing the true positive rate. The final network of 541 interactions among 239 proteins (of which 179 are transporters) is the first protein interaction network enriched in membrane transporters reported for any organism. Its topological attributes are similar to those of other published protein interaction networks. It also extends and fills gaps in currently available biological networks in plants and allows building a number of hypotheses about processes and mechanisms involving signal transduction and transport systems.
