From protein interactions to functional annotation: graph alignment in Herpes

Background: Sequence alignment is a prolific basis of functional annotation, but remains a challenging problem in the 'twilight zone' of high sequence divergence or short gene length. Here we demonstrate how information on gene interactions can help to resolve ambiguous sequence alignments. We compare two distant Herpes viruses by constructing a graph alignment, which is based jointly on the similarity of their protein interaction networks and on sequence similarity. This hybrid method provides functional associations between proteins of the two organisms that cannot be obtained from sequence or interaction data alone.

Results: We find proteins where interaction similarity and sequence similarity are individually weak, but together provide significant evidence of orthology. There are also proteins with high interaction similarity but without any detectable sequence similarity, providing evidence of functional association beyond sequence homology. The functional predictions derived from our alignment are consistent with genomic position and gene expression data.

Conclusion: Our approach shows that evolutionary conservation is a powerful filter to make protein interaction data informative about functional similarities between the interacting proteins, and it establishes graph alignment as a powerful tool for the comparative analysis of data from highly diverged species.
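The idea of scoring an alignment jointly on sequence similarity and interaction similarity can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the additive score, the `link_weight` parameter, the function names, the made-up similarity values, and the exhaustive search (feasible only for a handful of nodes). It is not the scoring function or the optimization procedure used in the paper.

```python
from itertools import permutations


def alignment_score(mapping, edges_a, edges_b, seq_sim, link_weight=1.0):
    """Toy score (hypothetical): summed sequence similarity of aligned pairs
    plus a reward for every interaction in network A that is conserved in B.
    `edges_b` must be a set of frozensets of node names."""
    node_term = sum(seq_sim.get((a, b), 0.0) for a, b in mapping.items())
    link_term = sum(
        1 for a1, a2 in edges_a
        if frozenset((mapping[a1], mapping[a2])) in edges_b
    )
    return node_term + link_weight * link_term


def best_alignment(nodes_a, nodes_b, edges_a, edges_b, seq_sim, link_weight=1.0):
    """Exhaustively try every one-to-one mapping from A into B (tiny graphs only)."""
    undirected_b = {frozenset(e) for e in edges_b}
    best_score, best_map = None, None
    for perm in permutations(nodes_b, len(nodes_a)):
        mapping = dict(zip(nodes_a, perm))
        score = alignment_score(mapping, edges_a, undirected_b, seq_sim, link_weight)
        if best_score is None or score > best_score:
            best_score, best_map = score, mapping
    return best_score, best_map


# Made-up example: a2 has weak, ambiguous sequence similarity to b2 and b3.
nodes_a = ["a1", "a2", "a3"]
nodes_b = ["b1", "b2", "b3"]
edges_a = [("a1", "a2"), ("a2", "a3")]
edges_b = [("b1", "b2"), ("b2", "b3")]
seq_sim = {("a1", "b1"): 2.0, ("a2", "b2"): 0.4, ("a2", "b3"): 0.5}

# Sequence similarity alone (interaction term switched off).
_, seq_only = best_alignment(nodes_a, nodes_b, edges_a, edges_b, seq_sim, link_weight=0.0)
# Sequence similarity and conserved interactions together.
_, joint = best_alignment(nodes_a, nodes_b, edges_a, edges_b, seq_sim, link_weight=1.0)
print("sequence only:", seq_only)
print("joint:", joint)
```

In this toy case the sequence-only score pairs a2 with b3 by a narrow margin, whereas the joint score pairs a2 with b2 because that choice conserves both interactions. It mirrors, in miniature, how interaction data can resolve an ambiguous sequence match.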
