Median correlation for the analysis of gene expression data

Microarray technology is revolutionizing functional genomics research by allowing scientists to measure the expression levels of thousands of genes simultaneously from a single sample. However, a standard protocol for microarray data analysis has yet to be established. Many current analysis techniques rely on linear correlation between pairs of genes. Such analysis detects only relationships in signal components that exhibit Gaussian correlation statistics. This paper focuses on determining the relationships between gene expression patterns through hierarchical clustering of nonlinear correlation measurements. The methods described herein are illustrated with gene expression data from yeast and from human cancer cell lines. The results indicate that, in some cases, nonlinear correlation metrics yield improved clustering of genes.
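The general idea of clustering genes hierarchically on a nonlinear correlation measure can be illustrated with a minimal sketch. The abstract does not define the paper's median correlation measure, so Spearman rank correlation is swapped in here as a stand-in nonlinear measure; it is not the authors' method. The gene names, expression values, the distance 1 - correlation, and the choice of average linkage are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def cluster(profiles, n_clusters, corr=spearman):
    """Average-linkage agglomerative clustering with distance 1 - corr."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - corr(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [[n] for n in names]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical expression profiles (not data from the paper).
profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [1, 2, 4, 8, 100],   # rises with geneA, but nonlinearly
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [100, 8, 4, 2, 1],   # falls with geneC, but nonlinearly
}

linear_ab = pearson(profiles["geneA"], profiles["geneB"])
rank_ab = spearman(profiles["geneA"], profiles["geneB"])
groups = cluster(profiles, n_clusters=2)
```

In this toy data, geneA and geneB move together monotonically but not linearly, so the rank correlation treats them as perfectly related while the linear coefficient does not. Using 1 - correlation as the distance places anti-correlated genes far apart; an analysis that wants to group them together might use 1 - |correlation| instead.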
