Network constrained clustering for gene microarray data

Many bioinformatics problems can be tackled from a fresh angle offered by the network perspective. Directly inspired by structural studies of metabolic networks, we propose an improved gene clustering approach for inferring gene signaling pathways from gene microarray data. The approach is based on constructing co-expression networks that consist of both linear and nonlinear gene associations, screened with controlled biological and statistical significance. As a result, it tends to group functionally related genes into tight clusters despite their expression dissimilarities. We illustrate our approach and compare it to traditional clustering approaches on a yeast galactose metabolism data set and a retinal gene expression data set. Our approach greatly outperforms the traditional approaches in rediscovering the relatively well-known galactose metabolism pathway in yeast and in clustering genes of the photoreceptor differentiation pathway. The clustering method has been implemented in an R package, “GeneNT”, that is freely available from: http://www.cran.org.
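The idea of a network-constrained distance can be illustrated with a minimal sketch. It is not the GeneNT implementation, and it rests on several assumptions made only for illustration: associations are measured with Pearson correlation alone (the nonlinear associations mentioned above are omitted), p-values come from the Fisher z-transform normal approximation, statistical significance is controlled with the Benjamini-Hochberg procedure, and biological significance is approximated by a hypothetical minimum correlation strength `min_strength`. All function names, gene names, and expression values below are invented.

```python
import math
from collections import deque
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for r = 0 (Fisher z-transform, normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals, alpha):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return set(order[:k])


def build_network(expr, alpha=0.05, min_strength=0.7):
    """Keep an edge only if it passes both the FDR and the strength screen."""
    genes = sorted(expr)
    pairs = list(combinations(genes, 2))
    n = len(expr[genes[0]])
    rs = [pearson(expr[a], expr[b]) for a, b in pairs]
    significant = benjamini_hochberg([fisher_p_value(r, n) for r in rs], alpha)
    adj = {g: set() for g in genes}
    for i, (a, b) in enumerate(pairs):
        if i in significant and abs(rs[i]) >= min_strength:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to each reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Invented profiles: geneA and geneC are only weakly correlated with each other,
# but both are strongly correlated with geneB; geneD is unrelated to all three.
expr = {
    "geneA": [1, -1, 1, -1, 1, -1, 1, -1],
    "geneB": [1.5, -0.5, 0.5, -1.5, 1.5, -0.5, 0.5, -1.5],
    "geneC": [3, 1, -1, -3, 3, 1, -1, -3],
    "geneD": [1, 1, 1, 1, -1, -1, -1, -1],
}
adj = build_network(expr)
dist_from_a = shortest_path_lengths(adj, "geneA")
```

In this toy example geneA and geneC share no direct edge, yet they are two steps apart through geneB, while geneD is unreachable. A clustering procedure that uses such path lengths as its distance can therefore place geneA and geneC together even though their profiles are dissimilar.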
