PPIXpress: construction of condition-specific protein interaction networks based on transcript expression

Protein-protein interaction networks are an important component of modern systems biology. Yet comparatively few efforts have been made to tailor their topology to the actual cellular condition being studied. Here, we present a network construction method that exploits expression data at the transcript level and thus reveals alterations in protein connectivity caused not only by differential gene expression but also by alternative splicing. We achieved this by establishing a direct correspondence between individual protein interactions and the underlying domain interactions in a complete but condition-unspecific protein interaction network. This knowledge was then used to infer the condition-specific presence of interactions from the dominant protein isoforms. When we compared contextualized interaction networks of matched normal and tumor samples in breast cancer, our transcript-based construction identified more significant alterations affecting proteins associated with carcinogenesis than a method that uses only gene expression data. The approach is provided as the user-friendly tool PPIXpress.

Availability and implementation: PPIXpress is available at https://sourceforge.net/projects/ppixpress/.
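The construction idea described in the abstract can be illustrated with a minimal Python sketch. It is not the PPIXpress implementation or its API: all function names, the toy proteins (`A`, `B`, `C`), transcripts, domain labels and expression values are hypothetical. The sketch also assumes that an interaction without any known supporting domain interaction is kept whenever both proteins are expressed; that fallback rule is an assumption made here for illustration.

```python
from itertools import product


def dominant_isoforms(expr, protein_of, threshold=0.0):
    """Return {protein: most abundant transcript} for expressed transcripts."""
    best = {}
    for tx, level in expr.items():
        if level <= threshold:
            continue  # transcript counts as not expressed
        protein = protein_of[tx]
        if protein not in best or level > expr[best[protein]]:
            best[protein] = tx
    return best


def supporting_ddis(p1, p2, ref_domains, ddis):
    """Domain pairs that could mediate the p1-p2 interaction in the reference network."""
    return {
        (d1, d2)
        for d1, d2 in product(ref_domains[p1], ref_domains[p2])
        if (d1, d2) in ddis or (d2, d1) in ddis
    }


def contextualize(ref_ppi, ref_domains, tx_domains, ddis, dominant):
    """Keep reference interactions that the dominant isoforms can still realize."""
    kept = set()
    for p1, p2 in ref_ppi:
        if p1 not in dominant or p2 not in dominant:
            continue  # at least one partner is not expressed
        support = supporting_ddis(p1, p2, ref_domains, ddis)
        if not support:
            kept.add((p1, p2))  # assumed fallback: no domain evidence to test
            continue
        doms1 = tx_domains[dominant[p1]]
        doms2 = tx_domains[dominant[p2]]
        if any(d1 in doms1 and d2 in doms2 for d1, d2 in support):
            kept.add((p1, p2))
    return kept


# Hypothetical toy data: protein A has two isoforms, one lacking domain "dom_x".
ref_ppi = {("A", "B"), ("A", "C")}
ref_domains = {"A": {"dom_x", "dom_k"}, "B": {"dom_y"}, "C": {"dom_z"}}
ddis = {("dom_x", "dom_y")}
tx_domains = {
    "A-1": {"dom_x", "dom_k"},
    "A-2": {"dom_k"},
    "B-1": {"dom_y"},
    "C-1": {"dom_z"},
}
protein_of = {"A-1": "A", "A-2": "A", "B-1": "B", "C-1": "C"}

normal_expr = {"A-1": 10.0, "A-2": 2.0, "B-1": 5.0, "C-1": 3.0}
tumor_expr = {"A-1": 1.0, "A-2": 12.0, "B-1": 5.0, "C-1": 3.0}

normal_net = contextualize(
    ref_ppi, ref_domains, tx_domains, ddis,
    dominant_isoforms(normal_expr, protein_of),
)
tumor_net = contextualize(
    ref_ppi, ref_domains, tx_domains, ddis,
    dominant_isoforms(tumor_expr, protein_of),
)
print(sorted(normal_net))
print(sorted(tumor_net))
```

In this toy case gene A is expressed in both samples, so a construction based on gene expression alone would keep the A-B interaction in both networks. The transcript-based version drops it in the tumor sample, because the dominant isoform there lacks the domain that mediates the interaction.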

[1]  Sumio Sugano,et al.  Aberrant transcriptional regulations in cancers: genome, transcriptome and epigenome analysis of lung adenocarcinoma cell lines , 2014, Nucleic acids research.

[2]  Sherif Abou Elela,et al.  Cancer-associated regulation of alternative splicing , 2009, Nature Structural &Molecular Biology.

[3]  Marc S. Cortese,et al.  Flexible nets , 2005, The FEBS journal.

[4]  Elspeth A. Bruford,et al.  Genenames.org: the HGNC resources in 2015 , 2014, Nucleic Acids Res..

[5]  Robert D. Finn,et al.  iPfam: a database of protein family and domain interactions found in the Protein Data Bank , 2013, Nucleic Acids Res..

[6]  M. Vidal,et al.  Edgetic perturbation models of human inherited disorders , 2009, Molecular systems biology.

[7]  Pietro Liò,et al.  The BioMart community portal: an innovative alternative to large, centralized data repositories , 2015, Nucleic Acids Res..

[8]  Ilan Y. Smoly,et al.  Comparative Analysis of Human Tissue Interactomes Reveals Factors Leading to Tissue-Specific Manifestation of Hereditary Diseases , 2014, PLoS Comput. Biol..

[9]  Susumu Goto,et al.  Data, information, knowledge and principle: back to metabolism in KEGG , 2013, Nucleic Acids Res..

[10]  B. Frey,et al.  Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing , 2008, Nature Genetics.

[11]  Bumki Min,et al.  IDDI: integrated domain-domain interaction and protein interaction analysis system , 2012, Proteome Science.

[12]  M. Vidal,et al.  Edgotype: a fundamental link between genotype and phenotype. , 2013, Current opinion in genetics & development.

[13]  Steven J. M. Jones,et al.  Comprehensive molecular portraits of human breast tumors , 2012, Nature.

[14]  A. Barabasi,et al.  Interactome Networks and Human Disease , 2011, Cell.

[15]  Marcos J. Araúzo-Bravo,et al.  A unique Oct4 interface is crucial for reprogramming to pluripotency , 2013, Nature Cell Biology.

[16]  Colin N. Dewey,et al.  RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome , 2011, BMC Bioinformatics.

[17]  Cole Trapnell,et al.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. , 2010, Nature biotechnology.

[18]  Alfonso Valencia,et al.  Most highly expressed protein-coding genes have a single dominant isoform. , 2015, Journal of proteome research.

[19]  Diego Miranda-Saavedra,et al.  Systematic identification of transcriptional regulatory modules from protein–protein interaction networks , 2013, Nucleic acids research.

[20]  Alfonso Valencia,et al.  Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function , 2012, Molecular biology and evolution.

[21]  Arnaud Céol,et al.  3did: a catalog of domain-based interactions of known three-dimensional structure , 2013, Nucleic Acids Res..

[22]  J. Manley,et al.  Alternative pre-mRNA splicing regulation in cancer: pathways and programs unhinged. , 2010, Genes & development.

[23]  Kara Dolinski,et al.  The BioGRID interaction database: 2015 update , 2014, Nucleic Acids Res..

[24]  Lusheng Wang,et al.  Protein complex prediction based on maximum matching with domain-domain interaction. , 2012, Biochimica et biophysica acta.

[25]  Matthew Fraser,et al.  InterProScan 5: genome-scale protein function classification , 2014, Bioinform..

[26]  Juancarlos Chan,et al.  Gene Ontology Consortium: going forward , 2014, Nucleic Acids Res..

[27]  Rafael C. Jimenez,et al.  The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases , 2013, Nucleic Acids Res..

[28]  Alex Bateman,et al.  Tissue-Specific Splicing of Disordered Segments that Embed Binding Motifs Rewires Protein Interaction Networks , 2012, Molecular cell.

[29]  Sailu Yellaboina,et al.  DOMINE: a comprehensive collection of known and predicted domain-domain interactions , 2010, Nucleic Acids Res..

[30]  Xinchen Wang,et al.  Tissue-specific alternative splicing remodels protein-protein interaction networks. , 2012, Molecular cell.

[31]  Brad T. Sherman,et al.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources , 2008, Nature Protocols.

[32]  Roded Sharan,et al.  Enhancing the Prioritization of Disease-Causing Genes through Tissue Specific Protein Interaction Networks , 2012, PLoS Comput. Biol..

[33]  Rachael P. Huntley,et al.  QuickGO: a web-based tool for Gene Ontology searching , 2009, Bioinform..

[34]  István A. Kovács,et al.  Widespread Macromolecular Interaction Perturbations in Human Genetic Disorders , 2015, Cell.

[35]  V. Helms,et al.  A STIM2 splice variant negatively regulates store-operated calcium entry , 2015, Nature Communications.

[36]  Alexander Goncearenco,et al.  Coverage of protein domain families with structural protein-protein interactions: current progress and future trends. , 2014, Progress in biophysics and molecular biology.

[37]  J. Harrow,et al.  Transcriptome analysis of human tissues and cell lines reveals one dominant transcript per gene , 2013, Genome Biology.

[38]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[39]  Davide Heller,et al.  STRING v10: protein–protein interaction networks, integrated over the tree of life , 2014, Nucleic Acids Res..

[40]  Dmitri D. Pervouchine,et al.  The human transcriptome across tissues and individuals , 2015, Science.

[41]  Yukiko Matsuoka,et al.  Tissue-specific subnetworks and characteristics of publicly available human protein interaction databases , 2011, Bioinform..

[42]  J. Harrow,et al.  GENCODE: producing a reference annotation for ENCODE , 2006, Genome Biology.

[43]  E. Levanon,et al.  Identification of recurrent regulated alternative splicing events across human solid tumors , 2015, Nucleic acids research.

[44]  María Martín,et al.  UniProt: A hub for protein information , 2015 .

[45]  L. Holm,et al.  The Pfam protein families database , 2005, Nucleic Acids Res..

[46]  Dorothea Emig,et al.  AltAnalyze and DomainGraph: analyzing and visualizing exon expression data , 2010, Nucleic Acids Res..

[47]  Hisashi Kashima,et al.  Protein complex prediction via verifying and reconstructing the topology of domain-domain interactions , 2010, BMC Bioinformatics.

[48]  A. Sinha,et al.  Nodes occupying central positions in human tissue specific PPI networks are enriched with many splice variants , 2014, Proteomics.

[49]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[50]  Robert D. Finn,et al.  The challenge of increasing Pfam coverage of the human proteome , 2013, Database J. Biol. Databases Curation.

[51]  E. Birney,et al.  Pfam: the protein families database , 2013, Nucleic Acids Res..

[52]  Ben Lehner,et al.  Tissue specificity and the human protein interaction network , 2009, Molecular systems biology.

[53]  David Talavera,et al.  Alternative splicing and protein interaction data sets , 2013, Nature Biotechnology.

[54]  Volkhard Helms,et al.  Identifying transcription factor complexes and their roles , 2014, Bioinform..

[55]  V. Solovyev,et al.  Automatic annotation of eukaryotic genes, pseudogenes and promoters , 2006, Genome Biology.

[56]  Alfonso Valencia,et al.  APPRIS: annotation of principal and alternative splice isoforms , 2012, Nucleic Acids Res..