Finding susceptible and protective interaction patterns in large-scale genetic association study

Interaction detection in large-scale genetic association studies has attracted intensive research interest because many diseases are complex traits. Various approaches have been developed for finding significant genetic interactions. In this article, we propose a novel framework, SRMiner, to detect interacting susceptible and protective genotype patterns. SRMiner can discover not only probable combinations of single nucleotide polymorphisms (SNPs) that cause diseases but also the corresponding SNPs that suppress their pathogenic functions, which offers a better perspective for uncovering the underlying relationships between genetic variants and complex diseases. We have performed extensive experiments on several real Wellcome Trust Case Control Consortium (WTCCC) datasets. We use pathway-based and protein-protein interaction (PPI) network-based evaluation methods to verify the discovered patterns. The results show that SRMiner successfully identifies many disease-related genes verified by existing work. SRMiner can also infer some unconfirmed but highly plausible disease-related genes.
