Interactive Analysis of Gene Interactions Using a Graphical Gaussian Model

DNA microarrays provide a powerful basis for the analysis of gene expression. Data mining methods such as clustering have been widely applied to microarray data to link genes that show similar expression patterns. However, clustering usually fails to reveal gene-gene interactions within a single cluster. Association rule mining and loglinear models have been used for this purpose, but their inherent limitations, together with the information loss caused by discretization, limit the applicability of their results. Here we propose the use of a Graphical Gaussian Model to discover pairwise gene interactions. We have constructed a prototype system that permits rapid interactive exploration of gene relationships; its results can be validated by experts or against known information, or can suggest new experiments. We have tested our methodology on yeast microarray data. Our results reveal some previously unknown interactions that have solid biological explanations.
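
To make the idea of pairwise interactions in a Graphical Gaussian Model concrete, the following is a minimal sketch of the standard textbook formulation, not necessarily the estimation procedure used in the prototype system. It assumes expression values arranged as a samples-by-genes array, computes partial correlations from the inverse covariance (precision) matrix, and keeps gene pairs whose absolute partial correlation exceeds a cutoff. The function names, the gene labels, the synthetic data, and the threshold of 0.5 are all illustrative assumptions.

```python
import numpy as np

def partial_correlations(expr):
    """Partial correlation matrix from a (samples x genes) array.

    Standard GGM identity: rho_ij = -w_ij / sqrt(w_ii * w_jj),
    where w is the inverse of the covariance matrix.
    """
    cov = np.cov(expr, rowvar=False)
    # Pseudo-inverse is a simple stand-in; with more genes than
    # samples a regularized estimator would be needed instead.
    prec = np.linalg.pinv(cov)
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def candidate_edges(pcor, names, threshold=0.5):
    """List gene pairs whose |partial correlation| reaches the threshold.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(pcor[i, j]) >= threshold:
                edges.append((names[i], names[j], float(pcor[i, j])))
    return edges

# Synthetic chain A -> B -> C: A and C are correlated overall,
# but conditionally independent once B is taken into account.
rng = np.random.default_rng(0)
n_samples = 2000
a = rng.normal(size=n_samples)
b = a + 0.3 * rng.normal(size=n_samples)
c = b + 0.3 * rng.normal(size=n_samples)
expr = np.column_stack([a, b, c])

names = ["geneA", "geneB", "geneC"]  # hypothetical gene labels
pcor = partial_correlations(expr)
edges = candidate_edges(pcor, names)
for g1, g2, r in edges:
    print(f"{g1} -- {g2}: partial correlation {r:.2f}")
```

In this synthetic chain, the A-C pair is strongly correlated in the ordinary sense but has a partial correlation near zero, so only the direct links remain as candidate edges. This is the property that lets a Graphical Gaussian Model distinguish direct pairwise interactions from the indirect similarity that places genes in the same cluster.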
