EMu: probabilistic inference of mutational processes and their localization in the cancer genome

The spectrum of mutations discovered in cancer genomes can be explained by the activity of a few elementary mutational processes. We present a novel probabilistic method, EMu, to infer the mutational signatures of these processes from a collection of sequenced tumors. EMu naturally incorporates the tumor-specific opportunity for different mutation types according to sequence composition. Applying EMu to breast cancer data, we derive detailed maps of the activity of each process, both genome-wide and within specific local regions of the genome. Our work provides new opportunities to study the mutational processes underlying cancer development. EMu is available at http://www.sanger.ac.uk/resources/software/emu/.
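To make the role of mutational opportunity concrete, here is a minimal sketch of one way such a model could be fitted. It is an illustration only, not the EMu implementation. It assumes that the count of mutation type m in tumor g is Poisson-distributed with rate opportunity[g, m] * sum_a activities[g, a] * signatures[a, m], and it alternates multiplicative EM-style updates for the activities and the signatures. The function name `fit_signatures`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_signatures(counts, opportunity, n_processes, n_iter=100, seed=0):
    """Fit a Poisson mixture of mutational processes (illustrative sketch).

    counts:      (tumors, mutation types) observed mutation counts
    opportunity: (tumors, mutation types) positive opportunity weights
    Returns activities (tumors, processes), signatures (processes, types)
    and the log-likelihood (up to a constant) after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_tumors, n_types = counts.shape
    # Strictly positive random start so multiplicative updates stay positive.
    activities = rng.random((n_tumors, n_processes)) + 0.1
    signatures = rng.random((n_processes, n_types)) + 0.1
    eps = 1e-12  # guards against division by zero
    log_lik = []
    for _ in range(n_iter):
        # Update activities with signatures held fixed.
        ratio = counts / (activities @ signatures + eps)
        activities *= (ratio @ signatures.T) / (opportunity @ signatures.T + eps)
        # Update signatures with the new activities held fixed.
        ratio = counts / (activities @ signatures + eps)
        signatures *= (activities.T @ ratio) / (activities.T @ opportunity + eps)
        # Rescale so each signature sums to 1; the product is unchanged.
        scale = signatures.sum(axis=1)
        signatures /= scale[:, None]
        activities *= scale[None, :]
        # Poisson log-likelihood, dropping the constant log(counts!) term.
        rate = opportunity * (activities @ signatures)
        log_lik.append(float(np.sum(counts * np.log(rate + eps) - rate)))
    return activities, signatures, log_lik


# Synthetic example (assumed data): 30 tumors, 12 mutation types, 2 processes.
rng = np.random.default_rng(1)
true_sig = rng.dirichlet(np.ones(12), size=2)
true_act = rng.gamma(2.0, 50.0, size=(30, 2))
opp = rng.uniform(0.5, 1.5, size=(30, 12))
counts = rng.poisson(opp * (true_act @ true_sig))

act, sig, ll = fit_signatures(counts, opp, n_processes=2)
```

Because the opportunity enters the expected count as a per-tumor, per-type weight, the fitted signatures describe mutation rates per available site rather than raw counts. In practice the number of processes would also have to be chosen, for example with a model selection criterion; this sketch takes it as given.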
