Paper information - GenMiner: Mining Informative Association Rules from Genomic Data

GENMINER is a smart adaptation of closed-itemset-based association rule extraction to genomic data. It takes advantage of the novel NORDI discretization method and of the CLOSE [27] algorithm to efficiently generate minimal non-redundant association rules. GENMINER facilitates the integration of numerous sources of biological information, such as gene expression data and annotations, and can tacitly integrate qualitative information on biological conditions (age, sex, etc.). We validated this approach by analyzing the microarray datasets used by Eisen et al. [10] together with several sources of biological annotations. The extracted associations revealed significant co-annotated and co-expressed gene patterns, showing important biological relationships between genes and their features. Several of these relationships are supported by recent biological literature.
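To make the vocabulary of the abstract concrete, the following is a minimal sketch of support, confidence, closure, and frequent closed itemsets on a toy gene table. It is not the CLOSE algorithm and does not implement NORDI: the enumeration is brute force, the expression levels are assumed to be already discretized, and every gene name, item label, and function name (`support`, `confidence`, `closure`, `closed_itemsets`) is hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each gene maps to a set of items that mixes
# already-discretized expression levels with annotations (all names invented).
GENES = {
    "g1": {"heat_shock:up", "GO:ribosome", "sporulation:down"},
    "g2": {"heat_shock:up", "GO:ribosome"},
    "g3": {"heat_shock:up", "GO:ribosome", "sporulation:down"},
    "g4": {"sporulation:down", "GO:glycolysis"},
}


def support(itemset, data=GENES):
    """Fraction of genes whose item set contains every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= items for items in data.values()) / len(data)


def confidence(antecedent, consequent, data=GENES):
    """Confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, data) / support(antecedent, data)


def closure(itemset, data=GENES):
    """Intersection of the item sets of all genes that contain `itemset`."""
    itemset = frozenset(itemset)
    covering = [items for items in data.values() if itemset <= items]
    if not covering:
        return itemset
    return frozenset(set.intersection(*covering))


def closed_itemsets(min_support, data=GENES):
    """Brute-force list of frequent closed itemsets (toy-sized data only)."""
    all_items = sorted(set().union(*data.values()))
    closed = set()
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            candidate = frozenset(combo)
            # An itemset is closed when it equals its own closure.
            if support(candidate, data) >= min_support and closure(candidate, data) == candidate:
                closed.add(candidate)
    return closed


if __name__ == "__main__":
    for itemset in sorted(closed_itemsets(0.5), key=len):
        print(sorted(itemset), support(itemset))
```

In this toy table, `heat_shock:up` and `GO:ribosome` always occur together, so neither is closed on its own. Keeping only closed itemsets is what lets a closed-itemset approach avoid emitting many rules that carry the same information.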

[1] M. Eisen, et al. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering, 2002, Genome Biology.

[2] José María Carazo, et al. Integrated analysis of gene expression by association rules discovery, 2006, BMC Bioinformatics.

[3] T. Speed, et al. GOstat: find statistically overrepresented Gene Ontologies within a group of genes, 2004, Bioinformatics.

[4] Ricardo Martínez, et al. Extracted Knowledge Interpretation in mining biological data: a survey, 2007, RCIS.

[5] Nicola J. Rinaldi, et al. Transcriptional Regulatory Networks in Saccharomyces cerevisiae, 2002, Science.

[6] Joaquín Dopazo, et al. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes, 2004, Bioinformatics.

[7] Ricardo Martínez, et al. Co-expressed gene groups analysis (CGGA): An automatic tool for the interpretation of microarray experiments, 2006.

[8] Roderick J. A. Little, et al. Statistical Analysis with Missing Data, 2002.

[9] Rakesh Agrawal, et al. Fast Algorithms for Mining Association Rules, 1994, VLDB.

[10] B. Dyson, et al. Running head: , 2019.

[11] Gediminas Adomavicius, et al. Handling very large numbers of association rules in the analysis of microarray data, 2002, KDD.

[12] Seon-Young Kim, et al. PAGE: Parametric Analysis of Gene Set Enrichment, 2005, BMC Bioinformatics.

[13] David Shore, et al. Fine-Structure Analysis of Ribosomal Protein Gene Transcription, 2006, Molecular and Cellular Biology.

[14] Jiong Yang, et al. Gene ontology friendly biclustering of expression profiles, 2004, Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference (CSB 2004).

[15] Nicolas Pasquier, et al. Efficient Mining of Association Rules Using Closed Itemset Lattices, 1999, Information Systems.

[16] Claude Pasquier, et al. THEA: ontology-driven analysis of microarray data, 2004, Bioinformatics.

[17] David Martin, et al. GOToolBox: functional analysis of gene datasets based on Gene Ontology, 2004, Genome Biology.

[18] Dan A. Simovici, et al. Generating an informative cover for association rules, 2002, Proceedings of the 2002 IEEE International Conference on Data Mining.

[19] D. Botstein, et al. Cluster analysis and display of genome-wide expression patterns, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[20] Jerry Li, et al. Within the fold: assessing differential expression measures and reproducibility in microarray assays, 2002, Genome Biology.

[21] Chad Creighton, et al. Mining gene expression databases for association rules, 2003, Bioinformatics.

[22] Petri Törönen, et al. Theme discovery from gene lists for identification and viewing of multiple functional groups, 2005, BMC Bioinformatics.

[23] Rajeev Motwani, et al. Dynamic itemset counting and implication rules for market basket data, 1997, SIGMOD '97.

[24] Anil K. Bera, et al. Efficient tests for normality, homoscedasticity and serial independence of regression residuals: Monte Carlo Evidence, 1981.

[25] Kian-Lee Tan, et al. Mining gene expression data for positive and negative co-regulated gene clusters, 2004, Bioinformatics.

[26] Huiming Ding, et al. The synthetic genetic interaction spectrum of essential genes, 2005, Nature Genetics.

[27] F. E. Grubbs. Procedures for Detecting Outlying Observations in Samples, 1969.

[28] P. Brown, et al. Exploring the metabolic and genetic control of gene expression on a genomic scale, 1997, Science.

[29] Stanley N. Cohen, et al. Effects of threshold choice on biological conclusions reached during analysis of gene expression by DNA microarrays, 2005, Proceedings of the National Academy of Sciences of the United States of America.

[30] R. Altman, et al. Whole-genome expression analysis: challenges beyond clustering, 2001, Current Opinion in Structural Biology.

[31] Stefan Kramer, et al. Analyzing microarray data using quantitative association rules, 2005, ECCB/JBI.

[32] Gerd Stumme, et al. Generating a Condensed Representation for Association Rules, 2005, Journal of Intelligent Information Systems.

[33] Daniel Hanisch, et al. Co-clustering of biological networks and gene expression data, 2002, ISMB.

[34] H. Lilliefors. On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown, 1967.

[35] Daniel L. Hartl, et al. GeneMerge - Post-genomic Analysis, Data Mining, and Hypothesis Testing, 2003, Bioinformatics.

[36] Philip S. Yu, et al. A new method to measure the semantic similarity of GO terms, 2007, Bioinformatics.

[37] Hagit Shatkay, et al. Genes, Themes, and Microarrays: Using Information Retrieval for Large-Scale Gene Analysis, 2000, ISMB.

[38] R. Morse, et al. RAP, RAP, open up! New wrinkles for RAP1 in yeast, 2000, Trends in Genetics.

[39] Nicole A. Lazar, et al. Statistical Analysis With Missing Data, 2003, Technometrics.