Pyramidic Clustering of Large-Scale Microarray Images

With the ongoing development of imaging techniques such as brain MRI, cDNA microarrays and satellite reconnaissance, the need for tools that can intelligently parse ever larger images keeps growing. Segmentation is one widely used family of such tools, and clustering algorithms are a common example. To cope with large data sets, current approaches require the data to be sampled or summarized before the analysis proper can take place. In this paper we propose a novel image analysis technique based on pyramidic grouping, namely copasetic clustering, which addresses the problem of applying traditional clustering techniques to these large-scale image data sets with limited resources. A further benefit of the technique is the transparency of its intermediate clustering steps; when applied to spatial data sets, this allows contextual information to be captured and incorporated to improve the accuracy of the result. The algorithm achieves an improvement of roughly 1-3 dB in signal-to-noise ratio over the conventional techniques described.
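
The abstract does not spell out the algorithm, so the following is only a rough illustration of the general pyramid idea, not the authors' copasetic clustering method. It assumes a grayscale image held as a list of rows, uses a weighted one-dimensional two-means step as a stand-in for the "traditional clustering technique", and merges tile summaries in 2x2 blocks until one summary remains. The names `two_means`, `pyramid_centroids` and `label_pixels`, the tile size, and the choice of two clusters are all hypothetical.

```python
from typing import List, Tuple

Summary = Tuple[float, int]  # (centroid intensity, number of pixels it represents)


def two_means(points: List[Summary], iters: int = 20) -> List[Summary]:
    """Weighted 1-D k-means with k=2 (assumed stand-in for any base clusterer)."""
    lo = min(v for v, _ in points)
    hi = max(v for v, _ in points)
    groups: Tuple[List[Summary], List[Summary]] = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for v, w in points:
            # Assign each summary to the nearer of the two current centroids.
            groups[0 if abs(v - lo) <= abs(v - hi) else 1].append((v, w))
        new = []
        for g, old in zip(groups, (lo, hi)):
            total = sum(w for _, w in g)
            new.append(sum(v * w for v, w in g) / total if total else old)
        if (new[0], new[1]) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new
    weights = [sum(w for _, w in g) for g in groups]
    # Drop an empty cluster (happens when a tile is uniform).
    return [(c, w) for c, w in ((lo, weights[0]), (hi, weights[1])) if w > 0]


def pyramid_centroids(image: List[List[float]], tile: int = 2) -> List[Summary]:
    """Cluster small tiles first, then merge summaries level by level."""
    rows, cols = len(image), len(image[0])
    # Level 0: each tile is clustered on its own, so only one tile's
    # pixels need to be in working memory at a time.
    level = []
    for r in range(0, rows, tile):
        row_summaries = []
        for c in range(0, cols, tile):
            pixels = [(float(image[y][x]), 1)
                      for y in range(r, min(r + tile, rows))
                      for x in range(c, min(c + tile, cols))]
            row_summaries.append(two_means(pixels))
        level.append(row_summaries)
    # Higher levels: merge 2x2 blocks of summaries until a single one is left.
    while len(level) > 1 or len(level[0]) > 1:
        merged = []
        for r in range(0, len(level), 2):
            merged_row = []
            for c in range(0, len(level[0]), 2):
                block = [s
                         for y in range(r, min(r + 2, len(level)))
                         for x in range(c, min(c + 2, len(level[0])))
                         for s in level[y][x]]
                merged_row.append(two_means(block))
            merged.append(merged_row)
        level = merged
    return level[0][0]


def label_pixels(image: List[List[float]], centroids: List[Summary]) -> List[List[int]]:
    """Label each pixel with the index of the nearest top-level centroid."""
    values = [v for v, _ in centroids]
    return [[min(range(len(values)), key=lambda i: abs(p - values[i])) for p in row]
            for row in image]


# Toy 4x4 "spot" image: bright 2x2 centre on a dark background.
image = [[10, 10, 10, 10],
         [10, 200, 200, 10],
         [10, 200, 200, 10],
         [10, 10, 10, 10]]
centroids = pyramid_centroids(image, tile=2)
labels = label_pixels(image, centroids)
```

Because every level keeps its per-tile summaries, the intermediate results stay inspectable, which is the kind of transparency the abstract refers to; how the actual method exploits that context is not shown here.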
