Finding edging genes from microarray data.

MOTIVATION: A set of genes and their expression levels can be used to distinguish diseased from normal tissues. Because a microarray covers a massive number of genes, a large number of edges separate the different classes in microarray space. The edging genes (EGs) may be co-regulated genes; they may also lie on the same pathway or be deregulated by the same non-coding genes, such as siRNA or miRNA. Every gene in a set of EGs is vital for identifying a tissue's class: a change in the expression of a single EG may shift a tissue from normal to diseased, or vice versa. Finding EGs is therefore of biological importance. In this work, we propose an algorithm to find these EGs effectively.

RESULT: We tested our algorithm on five microarray datasets. The results are compared with those of the border-based algorithm, which has been used to find gene groups and subsequently separate different classes of tissues. Our algorithm finds significantly more EGs than the border-based algorithm does. Because our algorithm prunes irrelevant patterns at earlier stages, its time and space costs are much lower than those of the border-based algorithm.

AVAILABILITY: The proposed algorithm is implemented in C++ on the Linux platform. The EGs of the five microarray datasets have been computed. The preprocessed datasets and the discovered EGs are available at http://www3.it.deakin.edu.au/~phoebe/microarray.html.
