Unsupervised technique for robust target separation and analysis of DNA microarray spots through adaptive pixel clustering

MOTIVATION: Microarray images challenge existing analytical methods because gene spots often exhibit characteristic imperfections. Irregular contours, donut shapes, artifacts, and low or heterogeneous expression distort the measured red and green intensities as well as their ratio R/G. New approaches are needed to ensure accurate data extraction from these images.

RESULTS: We introduce a novel method for assessing the intensity of gene spots. The technique clusters the pixels of a target area into foreground and background. For this purpose we implemented two clustering algorithms, derived from k-means and Partitioning Around Medoids (PAM), respectively. Results from the analysis of real gene spots indicate that our approach outperforms existing analytical methods, particularly for spots generally considered problematic because of imperfections or almost absent expression. Both PX(PAM) and PX(KMEANS) prove highly robust against various types of artifacts through adaptive partitioning, which assesses expression intensity values more accurately.

AVAILABILITY: The method is implemented as a combination of two complementary tools, Extractiff (Java) and Pixclust (R, a free statistical language), which are available upon request from the authors.
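The core idea of partitioning a target area's pixels into foreground and background can be illustrated with a minimal sketch. This is not the authors' Pixclust implementation: it assumes plain two-cluster k-means on the summed red and green intensity of each pixel, and the function names `foreground_mask` and `spot_ratio` are hypothetical. The background-corrected ratio it returns is likewise an illustrative choice, since the abstract does not specify how R/G is computed from the clusters.

```python
from statistics import mean


def foreground_mask(values, max_iter=100):
    """Label each 1-D intensity as foreground (True) or background (False).

    Two-cluster k-means (Lloyd's algorithm). In one dimension, assigning
    each value to the nearer of two centroids is the same as thresholding
    at the midpoint between the centroids.
    """
    lo, hi = float(min(values)), float(max(values))
    if lo == hi:
        # Flat target area: nothing separates from the background.
        return [False] * len(values)
    for _ in range(max_iter):
        cut = (lo + hi) / 2.0
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    cut = (lo + hi) / 2.0
    return [v > cut for v in values]


def spot_ratio(red, green):
    """Background-corrected R/G ratio for one target area.

    Assumption: pixels are clustered on the sum of both channels, and the
    mean background intensity is subtracted from the mean foreground
    intensity in each channel. Returns None if no ratio can be formed.
    """
    mask = foreground_mask([r + g for r, g in zip(red, green)])
    if not any(mask):
        return None

    def corrected(channel):
        fg = mean(v for v, m in zip(channel, mask) if m)
        bg = mean(v for v, m in zip(channel, mask) if not m)
        return fg - bg

    r, g = corrected(red), corrected(green)
    return r / g if g > 0 else None


if __name__ == "__main__":
    # Six hypothetical pixels: three dim background, three bright spot pixels.
    red = [10, 12, 11, 200, 210, 205]
    green = [20, 22, 21, 100, 105, 110]
    print(foreground_mask([r + g for r, g in zip(red, green)]))
    print(spot_ratio(red, green))
```

Because the partition adapts to the pixel values of each target area rather than to a fixed circle or global threshold, irregular or donut-shaped spots are handled by the same code path. A PAM-style variant would use medoids (actual pixel values) instead of means as cluster centers, which is less sensitive to outlying artifact pixels.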
