An improved burden-test pipeline for identifying associations from rare germline and somatic variants

Background: Identifying rare germline and somatic variants associated with cancer progression is an important research topic in cancer genomics. Although many approaches have been proposed for rare-variant association studies, they are poorly suited to cancer sequencing data because of several issues, such as over-reliance on pre-selection of variants and neglect of interacting hotspots.

Results: In this article, we propose an improved pipeline, RareProb-C, to identify interactions between germline variants and somatic mutations that influence cancer susceptibility from paired cancer sequencing data. RareProb-C first performs an algorithmic selection on the given variants by incorporating the variant allele frequencies. Interactions among the variants are considered within regions whose boundaries are determined by a four-gamete test. The pipeline then filters out singular cases according to the posterior probability at each site. Finally, it outputs the selected candidates that pass a collapse test.

Conclusions: We apply RareProb-C to a series of carefully constructed simulation cases, where it outperforms six existing genetic model-free approaches. We also test RareProb-C on 429 TCGA ovarian cancer cases, where it successfully identifies the known highlighted variants that are considered to increase disease susceptibility.
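The abstract does not spell out how the four-gamete test bounds a region, so the following is only a minimal sketch of the general idea, not RareProb-C's actual procedure. It assumes phased, biallelic haplotypes coded as 0/1, and it uses a greedy left-to-right scan that closes a region as soon as a new site is incompatible with any site already in it. The function names `four_gamete_incompatible` and `split_regions` are hypothetical.

```python
def four_gamete_incompatible(site_a, site_b):
    """Return True if all four gametes (00, 01, 10, 11) are observed.

    Assumes both sites are biallelic and coded as 0/1, one entry per haplotype.
    """
    return len(set(zip(site_a, site_b))) == 4


def split_regions(haplotypes):
    """Partition sites into consecutive regions of pairwise-compatible sites.

    `haplotypes` is a list of rows (one per haplotype), each a list of 0/1
    alleles. Returns half-open (start, end) site-index intervals.
    The greedy scan is an illustrative assumption, not the published method.
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    # Transpose rows (haplotypes) into columns (sites).
    columns = [[row[j] for row in haplotypes] for j in range(n_sites)]

    regions = []
    start = 0
    for j in range(1, n_sites):
        # Close the current region if site j conflicts with any site in it.
        if any(four_gamete_incompatible(columns[i], columns[j])
               for i in range(start, j)):
            regions.append((start, j))
            start = j
    if n_sites:
        regions.append((start, n_sites))
    return regions


# Toy data: sites 1 and 3 together show all four gametes.
haps = [
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
print(split_regions(haps))  # [(0, 3), (3, 4)]
```

In this toy input the first three sites are mutually compatible, while the fourth site conflicts with the second, so the scan yields two regions. Interactions among variants would then be evaluated only within each region.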
