Sensitive Detection of Viral Transcripts in Human Tumor Transcriptomes

In excess of % of human cancer incidents have a viral cofactor. Epidemiological studies of idiopathic human cancers indicate that additional tumor viruses remain to be discovered. Recent advances in sequencing technology have enabled systematic screening of human tumor transcriptomes for viral transcripts. However, technical problems such as the low abundance of viral transcripts in large volumes of sequencing data, viral sequence divergence, and homology between viral and human factors significantly confound the identification of tumor viruses. We have developed a novel computational approach for detecting viral transcripts in human cancers that takes these confounding factors into account and is applicable to a wide variety of viruses and tumors. We apply the approach to conduct the first systematic search for viruses in neuroblastoma, the most common cancer in infancy. The diverse clinical progression of this disease, as well as related epidemiological and virological findings, is highly suggestive of a pathogenic cofactor. However, a viral etiology of neuroblastoma is currently contested. We mapped transcriptomes of neuroblastoma, as well as positive and negative controls, to the human genome and all known viral genomes in order to detect both known and unknown viruses. Analysis of controls, comparisons with related methods, and statistical estimates demonstrate the high sensitivity of our approach. Detailed investigation of putative viral transcripts within neuroblastoma samples did not provide evidence for the existence of any known human viruses. Likewise, de novo assembly and analysis of chimeric transcripts did not reveal expression signatures associated with novel human pathogens. While confounding factors such as sample dilution or viral clearance in progressed tumors may, in principle, mask viral cofactors in the data, this is rendered less likely by the high sensitivity of our approach and the number of biological replicates analyzed. Therefore, our results suggest that frequent viral cofactors of metastatic neuroblastoma are unlikely.
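
The abstract refers to statistical estimates of sensitivity without giving their form. As a rough illustration only, and not the authors' actual calculation, the following sketch assumes that viral reads are sampled independently, so that their count in a library is approximately Poisson distributed. The function names and the parameters `total_reads`, `viral_fraction`, and `min_reads` are hypothetical and chosen for this example.

```python
import math


def detection_probability(total_reads, viral_fraction, min_reads=1):
    """Probability of seeing at least `min_reads` viral reads.

    Assumption (not from the paper): the viral read count is Poisson
    distributed with mean total_reads * viral_fraction.
    """
    lam = total_reads * viral_fraction
    # P(X >= k) = 1 - sum_{i < k} exp(-lam) * lam**i / i!
    below_threshold = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(min_reads)
    )
    return 1.0 - below_threshold


def reads_needed(viral_fraction, target_probability):
    """Smallest sequencing depth at which a single viral read is seen
    with at least `target_probability` under the same Poisson assumption."""
    # Solve 1 - exp(-N * f) >= p for N.
    return math.ceil(-math.log(1.0 - target_probability) / viral_fraction)


if __name__ == "__main__":
    # Hypothetical numbers: one viral transcript per million reads.
    for depth in (1_000_000, 10_000_000, 50_000_000):
        p = detection_probability(depth, 1e-6, min_reads=2)
        print(f"depth={depth:>11,d}  P(at least 2 viral reads)={p:.4f}")
    print("reads for 95% chance of one hit:", reads_needed(1e-6, 0.95))
```

Under this simplified model, the chance of missing a virus falls exponentially with sequencing depth, and analyzing several independent replicates multiplies the per-sample miss probabilities. This is one way to read the argument that high sensitivity and multiple replicates make an undetected frequent cofactor less likely.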

[97]  E. Robertson,et al.  Kaposi's Sarcoma-Associated Herpesvirus-Encoded Latency-Associated Nuclear Antigen Induces Chromosomal Instability through Inhibition of p53 Function , 2006, Journal of Virology.