Genome Diversity of Epstein-Barr Virus from Multiple Tumor Types and Normal Infection

ABSTRACT Epstein-Barr virus (EBV) infects most of the world's population and is causally associated with several human cancers, but little is known about how EBV genetic variation might influence infection or EBV-associated disease. There are currently no published wild-type EBV genome sequences from a healthy individual and very few genomes from EBV-associated diseases. We have sequenced 71 geographically distinct EBV strains from cell lines, multiple types of primary tumor, and blood samples and the first EBV genome from the saliva of a healthy carrier. We show that the established genome map of EBV accurately represents all strains sequenced, but novel deletions are present in a few isolates. We have increased the number of type 2 EBV genomes sequenced from one to 12 and establish that the type 1/type 2 classification is a major feature of EBV genome variation, defined almost exclusively by variation of EBNA2 and EBNA3 genes, but geographic variation is also present. Single nucleotide polymorphism (SNP) density varies substantially across all known open reading frames and is highest in latency-associated genes. Some T-cell epitope sequences in EBNA3 genes show extensive variation across strains, and we identify codons under positive selection, both important considerations for the development of vaccines and T-cell therapy. We also provide new evidence for recombination between strains, which provides a further mechanism for the generation of diversity. Our results provide the first global view of EBV sequence variation and demonstrate an effective method for sequencing large numbers of genomes to further understand the genetics of EBV infection.

IMPORTANCE Most people in the world are infected by Epstein-Barr virus (EBV), and it causes several human diseases, which occur at very different rates in different parts of the world and are linked to host immune system variation. Natural variation in EBV DNA sequence may be important for normal infection and for causing disease. Here we used rapid, cost-effective sequencing to determine 71 new EBV sequences from different sample types and locations worldwide. We showed geographic variation in EBV genomes and identified the most variable parts of the genome. We identified protein sequences that seem to have been selected by the host immune system and detected variability in known immune epitopes. This gives the first overview of EBV genome variation, important for designing vaccines and immune therapy for EBV, and provides techniques to investigate relationships between viral sequence variation and EBV-associated diseases.
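
As an illustration of what a per-gene SNP density comparison involves, the following is a minimal sketch and not the pipeline used in the study. It assumes a set of already aligned genome sequences of equal length and a table of open reading frame coordinates. The function name `snp_density`, the toy sequences, and the region names `orfA` and `orfB` are hypothetical and do not come from EBV data.

```python
from typing import Dict, List, Tuple


def snp_density(
    alignment: List[str], regions: Dict[str, Tuple[int, int]]
) -> Dict[str, float]:
    """Return the fraction of variable columns in each region.

    Regions use 0-based, end-exclusive coordinates on the alignment.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    densities = {}
    for name, (start, end) in regions.items():
        variable = 0
        for col in range(start, end):
            # Gaps and ambiguous bases are ignored when deciding
            # whether a column differs between strains.
            bases = {seq[col].upper() for seq in alignment} & set("ACGT")
            if len(bases) > 1:
                variable += 1
        densities[name] = variable / (end - start)
    return densities


# Toy alignment for illustration only (hypothetical, not EBV sequence).
toy_alignment = [
    "ATGCCCGATTAA",
    "ATGCCAGATTAA",
    "ATGTCAGAT-AA",
]
toy_regions = {"orfA": (0, 6), "orfB": (6, 12)}

result = snp_density(toy_alignment, toy_regions)
print(result)
```

In this toy case `orfA` has two variable columns out of six and `orfB` has none, because the only difference in `orfB` is a gap. A real analysis would also need to handle alignment quality, repeat regions, and the choice of reference coordinates.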
