A rapid and accurate approach for prediction of interactomes from co-elution data (PrInCE)

Background: An organism’s protein interactome, or complete network of protein-protein interactions, defines the protein complexes that drive cellular processes. Techniques for studying protein complexes have traditionally applied targeted strategies such as yeast two-hybrid or affinity purification-mass spectrometry to assess protein interactions. However, given the vast number of protein complexes, more scalable methods are necessary to accelerate interaction discovery and to construct whole interactomes. We recently developed a complementary technique based on the use of protein correlation profiling (PCP) and stable isotope labeling by amino acids in cell culture (SILAC) to assess chromatographic co-elution as evidence of interacting proteins. Importantly, PCP-SILAC is also capable of measuring protein interactions simultaneously under multiple biological conditions, allowing the detection of treatment-specific changes to an interactome. Given the uniqueness and high dimensionality of co-elution data, new tools are needed to compare protein elution profiles, control false discovery rates, and construct an accurate interactome.

Results: Here we describe a freely available bioinformatics pipeline, PrInCE, for the analysis of co-elution data. PrInCE is a modular, open-source library that is computationally inexpensive, able to use labeled and label-free data, and capable of detecting tens of thousands of protein-protein interactions. Using a machine learning approach, PrInCE offers greatly reduced run time, more predicted interactions at the same stringency, prediction of protein complexes, and greater ease of use over previous bioinformatics tools for co-elution data. PrInCE is implemented in MATLAB (version R2017a). Source code and standalone executable programs for Windows and Mac OS X are available at https://github.com/fosterlab/PrInCE, where usage instructions can be found. An example dataset and output are also provided for testing purposes.

Conclusions: PrInCE is the first fast and easy-to-use data analysis pipeline that predicts interactomes and protein complexes from co-elution data. PrInCE allows researchers without bioinformatics expertise to analyze high-throughput co-elution datasets.
